Abstract

We revisit classic algorithmic search and optimization problems from the perspective of
competition. Rather than a single optimizer minimizing expected cost, we consider a zero-sum
game in which an optimization problem is presented to two players, whose only goal is
to outperform the opponent. Such games are typically exponentially large zero-sum games, but 
they often have a rich combinatorial structure. We provide general techniques by which such 
structure can be leveraged to find minmax-optimal and approximate minmax-optimal strategies. 
We give examples of ranking, hiring, compression, and binary search duels, among others. We 
give bounds on how often one can beat the classic optimization algorithms in such duels. 
1 Introduction



Many natural optimization problems have two-player competitive analogs. For example, consider
the ranking problem of selecting an order on n items, where the cost of searching for a
single item is its rank in the list. Given a fixed probability distribution over desired items, the 
trivial greedy algorithm, which orders items in decreasing probability, is optimal. 

Next consider the following natural two-player version of the problem, which models a user 
choosing between two search engines. The user thinks of a desired web page and a query and 
executes the query on both search engines. The engine that ranks the desired page higher is chosen
by the user as the "winner." If the greedy algorithm has the ranking of pages $\omega_1, \omega_2, \ldots, \omega_n$,
then the ranking $\omega_2, \omega_3, \ldots, \omega_n, \omega_1$ beats the greedy ranking on every item except $\omega_1$. We say
the greedy algorithm is $1 - 1/n$ beatable because there is a probability distribution over pages
for which the greedy algorithm loses $1 - 1/n$ of the time. Thus, in a competitive setting, an
"optimal" search engine can perform poorly against a clever opponent.

This ranking duel can be modeled as a symmetric constant-sum game, with $n!$ strategies,
in which the player with the higher ranking of the target page receives a payoff of 1 and the
other receives a payoff of 0 (in the case of a tie, say they both receive a payoff of 1/2). As in all
symmetric one-sum games, there must be (mixed) strategies that guarantee expected payoff of 
at least 1/2 against any opponent. Put another way, there must be a (randomized) algorithm 
that takes as input the probability distribution and outputs a ranking, which is guaranteed to 
achieve expected payoff of at least 1/2 against any opposing algorithm. 
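
The payoff rule above can be checked on a small instance. The following is a minimal sketch, assuming a uniform distribution over n = 5 pages and ties in the greedy order broken by index; the function name `duel_payoff` is illustrative and not from the paper.

```python
from fractions import Fraction

def duel_payoff(rank_a, rank_b, p):
    """Expected payoff to ranking A against ranking B: win = 1, tie = 1/2, loss = 0."""
    pos_a = {item: k for k, item in enumerate(rank_a)}
    pos_b = {item: k for k, item in enumerate(rank_b)}
    total = Fraction(0)
    for item, prob in p.items():
        if pos_a[item] < pos_b[item]:      # A lists the target page earlier
            total += prob
        elif pos_a[item] == pos_b[item]:   # same position: split the point
            total += prob / 2
    return total

n = 5
p = {i: Fraction(1, n) for i in range(1, n + 1)}   # assumed: uniform over pages
greedy = sorted(p, key=lambda i: (-p[i], i))       # decreasing probability
rotated = greedy[1:] + greedy[:1]                  # move the top page to the end
payoff = duel_payoff(rotated, greedy, p)
```

Here the rotated ranking wins on every page except the first, so `payoff` equals 1 - 1/n, while any ranking played against itself earns exactly 1/2.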

This conversion can be applied to any optimization problem with an element of uncertainty. 
Such problems are of the form $\min_{x \in X} \mathbb{E}_{\omega \sim p}[c(x, \omega)]$, where $p$ is a probability distribution over
the state of nature $\omega \in \Omega$, $X$ is a feasible set, and $c : X \times \Omega \to \mathbb{R}$ is an objective function.
The dueling analog has two players simultaneously choose $x, x'$; player 1 receives payoff 1 if
$c(x, \omega) < c(x', \omega)$, payoff 0 if $c(x, \omega) > c(x', \omega)$, and payoff 1/2 otherwise, and similarly for player 2.^1

There are many natural examples of this setting beyond the ranking duel mentioned above. 
For example, for the shortest-path routing under a distribution over edge times, the corresponding
racing duel is simply a race, and the state of nature encodes uncertain edge delays.^2 For
the classic secretary problem, in the corresponding hiring duel two employers must each select a 
candidate from a pool of n candidates (though, as standard, they must decide whether or not to 
choose a candidate before interviewing the next one), and the winner is the one that hires the 
better candidate. This could model, for example, two competing companies attempting to hire 
CEOs or two opposing political parties selecting politicians to run in an election; the absolute 
quality of the candidate may be less important than being better than the other's selection. 
In a compression duel, a user with a (randomly chosen) sample string $\omega$ chooses between two
compression schemes based on which one compresses that string better. This setting can also 
model a user searching for a file in two competing, hierarchical storage systems and choosing 
the system that finds the file first. In a binary search duel, a user searches for a random element 
in a list using two different search trees, and chooses whichever tree finds the element faster. 

Our contribution. For each of these problems, we consider a number of questions related 
to how vulnerable a classic algorithm is to competition, what algorithms will be selected at 
equilibrium, and how well these strategies at equilibrium solve the original optimization problem. 

Question 1. Will players use the classic optimization solution in the dueling setting? 

Intuitively, the answer to this question should depend on how much an opponent can game 
the classic optimization solution. For example, in the ranking duel an opponent can beat the 
greedy algorithm on almost all pages - and even the most oblivious player would quickly realize 
the need to change strategies. In contrast, we demonstrate that many classic optimization 

^1 Our techniques will also apply to asymmetric payoff functions; see Appendix D.

^2 We also refer to this as the primal duel because any other duel can be represented as a race with an appropriate
graph and probability distribution $p$, though there may be an exponential blowup in representation size.
solutions - such as the secretary algorithm for hiring, Huffman coding for compression, and
standard binary search - are substantially less vulnerable. We say an algorithm is $\beta$-beatable
(over distribution $p$) if there exists a response which achieves payoff $\beta$ against that algorithm
(over distribution $p$). We summarize our results on the beatability of the standard optimization
algorithm in each of our example optimization problems in the table below:



Optimization Problem    Upper Bound    Lower Bound
Ranking                 1 - 1/n        1 - 1/n
Racing                  1              1
Hiring                  0.82           0.51
Compression             3/4            2/3
Search                  5/8            5/8



Question 2. What strategies do players play at equilibrium? 

We say an algorithm efficiently solves the duel if it takes as input a representation of the 
game and probability distribution $p$, and outputs an action $x \in X$ distributed according to
some minmax optimal (i.e., Nash equilibrium) strategy. As our main result, we give a general 
method for solving duels that can be represented in a certain bilinear form. We also show how 
to convert an approximate best-response oracle for a dueling game into an approximate minmax 
optimal algorithm, using techniques from low-regret learning. We demonstrate the generality of 
these methods by showing how to apply them to the numerous examples described above. For 
many problems we consider, the problem of computing minmax optimal strategies reduces to 
finding a simple description of the space of feasible mixed strategies (i.e. expressing this set as 
the projection of a polytope with polynomially many variables and constraints). See [18] for a
thorough treatment of such problems. 

Question 3. Are these equilibrium strategies still good at solving the optimization problem? 

As an example, consider the ranking duel. How much more time does a web surfer need 
to spend browsing to find the page he is interested in, because more than one search engine is 
competing for his attention? In fact, the surfer may be better off due to competition, depending 
on the model of comparison. For example, the cost to the web surfer may be the minimum of 
the ranks assigned by each search engine. And we leave open the tantalizing possibility that 
this quantity could in general be smaller at equilibrium for two competing search engines than 
for just one search engine playing the greedy algorithm. 

Related work. The work most relevant to ours is the study of ranking games 0], and 
more generally the study of social context games [1]. In these settings, players' payoffs are 
translated into utilities based on social contexts, defined by a graph and an aggregation function. 
For example, a player's utility can be the sum/max/min of his neighbors' payoffs. This work 
studies the effect of social contexts on the existence and computation of game-theoretic solution 
concepts, but does not re-visit optimization algorithms in competitive settings. 

For the hiring problem, several competitive variants and their algorithmic implications have 
been considered (see, e.g., [10] and the references therein). A typical competitive setting is a 
(general sum) game where a player achieves payoff of 1 if she hires the very best applicant and 
zero otherwise. But, to the best of our knowledge, no one has considered the natural model of 
a duel where the objective is simply to hire a better candidate than the opponent. Also related 
to our algorithmic results are succinct zero-sum games, where a game has exponentially many 
strategies but the payoff function can be computed by a succinct circuit. This general class has 
been shown to be EXP-hard to solve [6], and also difficult to approximate [7].

Finally, we note the line of research on competition among mechanisms, such as the study 
of competing auctions (see e.g. [SJ [121 [HI [H]) or schedulers [2]. In such settings, each player 
selects a mechanism and then bidders select the auction to participate in and how much to bid 
there, where both designers and bidders are strategic. This work is largely concerned with the 
existence of sub-game perfect equilibrium. 
Outline. In Section 2 we define our model formally and provide a general framework for
solving dueling problems as well as the warmup example of the ranking duel. We then use these
tools to analyze the more intricate settings of the hiring duel (Section 3), the compression duel
(Section 4), and the search duel (Section 5). We describe avenues of future research in Section 6.

2 Preliminaries 

A problem of optimization under uncertainty, $(X, \Omega, c, p)$, is specified by a feasible set $X$, a
commonly-known distribution $p$ over the state of nature, $\omega$, chosen from set $\Omega$, and an objective
function $c : X \times \Omega \to \mathbb{R}$. For simplicity we assume all these sets are finite. When $p$ is clear from
context, we write the expected cost of $x \in X$ as $c(x) = \mathbb{E}_{\omega \sim p}[c(x, \omega)]$. The one-player optimum
is $\mathrm{opt} = \min_{x \in X} c(x)$. Algorithm $A$ takes as input $p$ and randomness $r \in [0, 1]$, and outputs
$x \in X$. We define $c(A) = \mathbb{E}_r[c(A(p, r))]$ and an algorithm $A$ is one-player optimal if $c(A) = \mathrm{opt}$.

In the two-person constant-sum duel game $D(X, \Omega, c, p)$, players simultaneously choose
$x, x' \in X$, and player 1's payoff is:

$$v(x, x', p) = \Pr_{\omega \sim p}[c(x, \omega) < c(x', \omega)] + \frac{1}{2} \Pr_{\omega \sim p}[c(x, \omega) = c(x', \omega)].$$

When $p$ is understood from context we write $v(x, x')$. Player 2's payoff is $v(x', x) = 1 - v(x, x')$.
This models a tie, $c(x, \omega) = c(x', \omega)$, as a half point for each. We define the value of a strategy,
$v(x, p)$, to be how much that strategy guarantees, $v(x, p) = \min_{x' \in X} v(x, x', p)$. Again, when $p$
is understood from context we write simply $v(x)$.

The set of probability distributions over set $S$ is denoted $\Delta(S)$. A mixed strategy is $\sigma \in \Delta(X)$.
As is standard, we extend the domain of $v$ to mixed strategies bilinearly by expectation. A
best response to mixed strategy $\sigma$ is a strategy which yields maximal payoff against $\sigma$, i.e.,
$\sigma'$ is a best response to $\sigma$ if it maximizes $v(\sigma', \sigma)$. A minmax strategy is a (possibly mixed)
strategy that guarantees the safety value, in this case 1/2, against any opponent play. The
best response to such a strategy yields payoffs of 1/2. The set of minmax strategies is denoted
$MM(D(X, \Omega, c, p)) = \{\sigma \in \Delta(X) \mid v(\sigma) = 1/2\}$. A basic fact about constant-sum games is that
the set of Nash equilibria is the cross product of the minmax strategies for player 1 and those
of player 2.

2.1 Bilinear duels

In a bilinear duel, the feasible sets of strategies are points in Euclidean space, i.e.,
$X \subseteq \mathbb{R}^n$, $X' \subseteq \mathbb{R}^{n'}$, and the payoff to player 1 is $v(x, x') = x^T M x'$ for some matrix $M \in \mathbb{R}^{n \times n'}$.
In $n \times n$ bimatrix games, $X$ and $X'$ are just simplices $\{x \in \mathbb{R}^n_{\ge 0} \mid \sum_i x_i = 1\}$. Let $K$ be the
convex hull of $X$. Any point in $K$ is achievable (in expectation) as a mixed strategy. Similarly
define $K'$. As we will point out in this section, solving these reduces to linear programming
with a number of constraints proportional to the number of constraints necessary to define the
feasible sets, $K$ and $K'$. (In typical applications, $K$ and $K'$ have a polynomial number of facets
but an exponential number of vertices.)

Let $K$ be a polytope defined by the intersection of $m$ halfspaces, $K = \{x \in \mathbb{R}^n \mid w_i \cdot x \ge
b_i \text{ for } i = 1, 2, \ldots, m\}$. Similarly, let $K'$ be the intersection of $m'$ halfspaces $w'_i \cdot x \ge b'_i$. The
typical way to reduce to an LP for constant-sum games is:

$$\max_{v \in \mathbb{R},\, x \in \mathbb{R}^n} v \quad \text{such that } x \in K \text{ and } x^T M x' \ge v \text{ for all } x' \in X'.$$

The above program has $m + |X'|$ constraints ($m$ constraints guaranteeing
that $x \in K$), and $|X'|$ is typically exponential. Instead, the following linear program has
$O(n' + m + m')$ constraints, and hence can be solved in time polynomial in $n', m, m'$ and the
bit-size representation of $M$ and the constraints in $K$ and $K'$.

$$\max_{x \in \mathbb{R}^n,\, \lambda \in \mathbb{R}^{m'}} \sum_{i=1}^{m'} \lambda_i b'_i \quad \text{such that } x \in K,\ \lambda \ge 0, \text{ and } x^T M = \sum_{i=1}^{m'} \lambda_i w'_i. \qquad (1)$$
Lemma 1. For any constant-sum game with strategies $x \in K$, $x' \in K'$ and payoffs $x^T M x'$, the
maximum of the above linear program is the value of the game to player 1, and any maximizing
$x$ is a minmax optimal strategy.

Proof. First we argue that the value of the game to player 1 is at least as large as the value of
the above LP. Let $x, \lambda$ maximize the above LP and let the maximum be $\alpha$. For any $x' \in K'$,

$$x^T M x' = \sum_{i=1}^{m'} \lambda_i w'_i \cdot x' \ge \sum_{i=1}^{m'} \lambda_i b'_i = \alpha.$$

Hence, this means that strategy $x$ guarantees player 1 at least $\alpha$ against any opponent response,
$x' \in K'$. Hence $\alpha \le v$ with equality iff $x$ is minmax optimal. Next, let $x$ be any minmax optimal
strategy, and let $v$ be the value of the constant-sum game. This means that $x^T M x' \ge v$ for all
$x' \in K'$ with equality for some point. In particular, the minmax theorem (equivalently, duality)
means that the LP $\min_{x' \in K'} x^T M x'$ has a minimum value of $v$ and that there is a vector
$\lambda \ge 0$ such that $\sum_{i=1}^{m'} \lambda_i w'_i = x^T M$ and $\sum_{i=1}^{m'} \lambda_i b'_i = v$. Hence $\alpha \ge v$. □

2.2 Reduction to bilinear duels 

The sets $X$ in a duel are typically objects such as paths, trees, rankings, etc., which are not
themselves points in Euclidean space. In order to use the above approach to reduce a given duel
$D(X, \Omega, c, p)$ to a bilinear duel in a computationally efficient manner, one needs the following:

1. An efficiently computable function $\phi : X \to K$ which maps any $x \in X$ to a feasible point
in $K \subseteq \mathbb{R}^n$.

2. A payoff matrix $M$ such that $v(x, x') = \phi(x)^T M \phi(x')$, demonstrating that
the problem is indeed bilinear.

3. A set of polynomially many feasible constraints which defines $K$.

4. A "randomized rounding algorithm" which takes as input a point in $K$ and outputs an object
in $X$.

In many cases, parts (1) and (2) are straightforward. Parts (3) and (4) may be more challenging. 
For example, for the binary trees used in the compression duel, it is easy to map a tree to a 
vector of node depths. However, we do not know how to efficiently determine whether a given 
vector of node depths is indeed a mixture over trees (except for certain types of trees which are 
in sorted order, like the binary search trees in the binary search duel). In the next subsection, 
we show how computing approximate best responses suffices. 

2.3 Approximating best responses and approximating minmax 

In some cases, the polytope K may have exponentially or infinitely many facets, in which case 
the above linear program is not very useful. In this section, we show that if one can compute 
approximate best responses for a bilinear duel, then one can approximate minmax strategies. 

For any $\epsilon > 0$, an $\epsilon$-best response to a player 2 strategy $x' \in K'$ is any $x \in K$ such that
$x^T M x' \ge \max_{y \in K} y^T M x' - \epsilon$. Similarly for player 1. An $\epsilon$-minmax strategy $x \in K$ for player 1
is one that guarantees player 1 an expected payoff no worse than the value minus $\epsilon$, i.e.,

$$\min_{x' \in K'} v(x, x') \ge \max_{y \in K} \min_{x' \in K'} v(y, x') - \epsilon.$$

Best response oracles are functions from $K$ to $K'$ and vice versa. However, for many applications
(and in particular the ones in this paper) where all feasible points are nonnegative, one
can define a best response oracle for all nonnegative points in the positive orthant. (With additional
effort, one can remove this assumption using Kleinberg and Awerbuch's elegant notion
of a barycentric spanner [3].) For scaling purposes, we assume that for some $B > 0$, the convex
sets are $K \subseteq [0, B]^n$ and $K' \subseteq [0, B]^{n'}$ and the matrix $M \in [-B, B]^{n \times n'}$ is bounded as well.
Fix any $\epsilon > 0$. We suppose that we are given an $\epsilon$-approximate best response oracle in the
following sense. For player 1, this is an oracle $O : [0, B]^{n'} \to K$ which has the property that
$O(x')^T M x' \ge \max_{x \in K} x^T M x' - \epsilon$ for any $x' \in [0, B]^{n'}$. Similarly for $O'$ for player 2. Hence, one
is able to potentially respond to things which are not feasible strategies of the opponent. As
can be seen in a number of applications, this does not impose a significant additional burden.

Lemma 2. For any $\epsilon > 0$, $n, n' \ge 1$, $B > 0$, and any bilinear duel with convex $K \subseteq [0, B]^n$
and $K' \subseteq [0, B]^{n'}$ and $M \in [-B, B]^{n \times n'}$, and any $\epsilon$-best response oracles, there is an algorithm
for finding (24:{ema,x{m,m')y/^B'^{nn'y/^^-minmax strategies x € K,x' £ K' . The algorithm 
uses poly($\beta, m, m', 1/\epsilon$) runtime and makes poly($\beta, m, m', 1/\epsilon$) oracle calls.

The reduction and proof is deferred to Appendix [X] It uses Hannan-type of algorithms, 
namely "Follow the expected leader" [TTj . 

We reduce the compression duel, where the base objects are trees, to a bilinear duel and use
the approximate best response oracle. To perform such a reduction, one needs the following.

1. An efficiently computable function $\phi : X \to K$ which maps any $x \in X$ to a feasible point
in $K \subseteq \mathbb{R}^n$.

2. A bounded payoff matrix $M$ such that $v(x, x') = \phi(x)^T M \phi(x')$, demonstrating
that the problem is indeed bilinear.

3. $\epsilon$-best response oracles for players 1 and 2. Here, the input to an $\epsilon$-best response oracle
for player 1 is $x' \in [0, B]^{n'}$.

2.4 Beatability 

One interesting quantity to examine is how well a one-player optimization algorithm performs in
the two-player game. In other words, if a single player was a monopolist solving the one-player
optimization problem, how badly could they be beaten if a second player suddenly entered? For
a particular one-player-optimal algorithm $A$, we define its beatability over distribution $p$ to be
$\mathbb{E}_r[v(A(p, r), p)]$, and we define its beatability to be $\inf_p \mathbb{E}_r[v(A(p, r), p)]$.

2.5 A warmup: the ranking duel 

In the ranking duel, $\Omega = [n] = \{1, 2, \ldots, n\}$, $X$ is the set of permutations over $n$ items, and
$c(\pi, \omega) \in [n]$ is the position of $\omega$ in $\pi$ (rank 1 is the "best" rank). The greedy algorithm, which
outputs permutation $(\omega_1, \omega_2, \ldots, \omega_n)$ such that $p(\omega_1) \ge p(\omega_2) \ge \cdots \ge p(\omega_n)$, is optimal in the
one-player version of the problem.^3

This game can be represented as a bilinear duel as follows. Let $K$ and $K'$ be the set of doubly
stochastic matrices, $K = K' = \{x \in \mathbb{R}^{n \times n}_{\ge 0} \mid \forall j\ \sum_i x_{ij} = 1,\ \forall i\ \sum_j x_{ij} = 1\}$. Here $x_{ij}$ indicates the
probability that item $i$ is placed in position $j$, in some distribution over rankings. The Birkhoff-
von Neumann Theorem states that the set $K$ is precisely the set of probability distributions
over rankings (where each ranking is represented as a permutation matrix $x \in \{0, 1\}^{n \times n}$), and
moreover any such $x \in K$ can be implemented efficiently via a form of randomized rounding.
See, for example, Corollary 1.4.15 of [14]. Note $K$ is a polytope in $n^2$ dimensions with $O(n)$
facets. In this representation, the expected payoff of $x$ versus $x'$ is

$$\sum_i p(i) \left( \frac{1}{2} \Pr[\text{Equally rank } i] + \Pr[\text{P1 ranks } i \text{ higher}] \right) = \sum_i p(i) \sum_j x_{ij} \left( \frac{1}{2} x'_{ij} + \sum_{k > j} x'_{ik} \right).$$

The above is clearly bilinear in $x$ and $x'$ and can be written as $x^T M x'$ for some matrix $M$ with
bounded coefficients. Hence, we can solve the bilinear duel by the linear program (1) and round
it to a (randomized) minmax optimal algorithm for ranking.
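
The bilinear payoff formula for the ranking duel can be evaluated directly on doubly stochastic matrices. The sketch below is illustrative only: items and positions are 0-indexed, and the helper names `perm_matrix`, `mix`, and `bilinear_payoff` are assumptions rather than anything defined in the paper.

```python
from fractions import Fraction
from itertools import permutations

def perm_matrix(ranking, n):
    """x[i][j] = 1 if item i sits in position j (both 0-indexed)."""
    x = [[Fraction(0)] * n for _ in range(n)]
    for j, item in enumerate(ranking):
        x[item][j] = Fraction(1)
    return x

def mix(rankings, n):
    """Uniform mixture over the given rankings, as a doubly stochastic matrix."""
    w = Fraction(1, len(rankings))
    x = [[Fraction(0)] * n for _ in range(n)]
    for r in rankings:
        m = perm_matrix(r, n)
        for i in range(n):
            for j in range(n):
                x[i][j] += w * m[i][j]
    return x

def bilinear_payoff(x, xp, p):
    """sum_i p(i) sum_j x_ij * (x'_ij / 2 + sum_{k > j} x'_ik)."""
    n = len(p)
    total = Fraction(0)
    for i in range(n):
        for j in range(n):
            tail = sum(xp[i][k] for k in range(j + 1, n))  # opponent ranks i lower
            total += p[i] * x[i][j] * (xp[i][j] / 2 + tail)
    return total

n = 3
p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
greedy = perm_matrix((0, 1, 2), n)
rotated = perm_matrix((1, 2, 0), n)
uniform = mix(list(permutations(range(n))), n)
```

For these inputs the two payoffs `bilinear_payoff(x, xp, p)` and `bilinear_payoff(xp, x, p)` always sum to 1, as expected of a constant-sum game.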

^3 In some cases, such as a model of competing search engines, one could have the agents rank only k items, but
the algorithmic results would be similar.
We next examine the beatability of the greedy algorithm. Note that for the uniform probability
distribution $p(1) = p(2) = \ldots = p(n) = 1/n$, the greedy algorithm outputting, say,
$(1, 2, \ldots, n)$ can be beaten with probability $1 - 1/n$ by the strategy $(2, 3, \ldots, n, 1)$. One can
make greedy's selection unique by setting $p(i) = 1/n + (i - n/2)\epsilon$, and for sufficiently small $\epsilon$
greedy can be beaten a fraction of time arbitrarily close to $1 - 1/n$.

3 Hiring Duel 

In a hiring duel, there are two employers A and B and two corresponding sets of workers
$U_A = \{a_1, \ldots, a_n\}$ and $U_B = \{b_1, \ldots, b_n\}$ with $n$ workers each. The $i$'th worker of each set has
a common value $v(i)$ where $v(i) > v(j)$ for all $i$ and $j > i$. Thus there is a total ranking of
workers $a_i \in U_A$ (similarly $b_i \in U_B$) where a rank of 1 indicates the best worker, and workers are
labeled according to rank. The goal of the employers is to hire a worker whose value (equivalently
rank) beats that of his competitor's worker. Workers are interviewed by employers one-by-one
in a random order. The relative ranks of workers are revealed to employers only at the time
of the interview. That is, at time $i$, each employer has seen a prefix of the interview order
consisting of $i$ workers and knows only the projection of the total ranking on this prefix.^4
Hiring decisions must be made at the time of the interview, and only one worker may be hired.
Thus the employers' pure strategies are mappings from any prefix and permutation of workers'
ranks in that prefix to a binary hiring decision. We note that the permutation of ranks in a
prefix does not affect the distribution of the rank of the just-interviewed worker, and hence
without loss of generality we may assume the strategies are mappings from the round number
and current rank to a hiring decision.

In dueling notation, our game is $(X, \Omega, c, p)$ where the elements of $X$ are functions $h :
\{1, \ldots, n\}^2 \to \{0, 1\}$ indicating for any round $i$ and projected rank of current interviewee $j \le i$
the hiring decision $h(i, j)$; $\Omega$ is the set of all pairs $(\sigma_A, \sigma_B)$ of permutations of $U_A$ and $U_B$; $c(h, \sigma)$
is the value $v(\sigma^{-1}(i^*))$ of the first candidate $i^* = \arg\min\{i : h(i, [\sigma^{-1}(i)]_i) = 1\}$ (where $[\sigma^{-1}(i)]_j$
indicates the projected rank of the $i$'th candidate among the first $j$ candidates according to $\sigma$)
that received an offer; and $p$ (as is typical in the secretary problem) is the uniform distribution
over $\Omega$. The mixed strategies $\pi \in \Delta(X)$ are simply mappings $\pi : \{1, \ldots, n\}^2 \to [0, 1]$ from
rounds and projected ranks to a probability $\pi(i, j)$ of a hiring decision.

The values $v(\cdot)$ may be chosen adversarially, and hence in the one-player setting the optimal
algorithm against a worst-case $v(\cdot)$ is the one that maximizes the probability of hiring the
best worker (the worst-case values set $v(1) = 1$ and $v(i) \ll 1$ for $i > 1$). In the literature
on secretary problems, the following classical algorithm is known to hire the best worker with
probability approaching $1/e$: interview $n/e$ workers and hire the next one that beats all the previous.
Furthermore, there is no other algorithm that hires the best worker with higher probability.
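
For small n the success probability of the classical rule can be computed exactly by enumerating all interview orders. This is a minimal sketch; the cutoff parameter `k` (playing the role of n/e) and the function names are illustrative assumptions, and the closed form used for comparison is the standard textbook expression rather than one stated in this paper.

```python
from fractions import Fraction
from itertools import permutations

def classical_hire(order, k):
    """Skip the first k candidates, then hire the first one better than all seen.
    `order` lists overall ranks in interview order (1 = best).
    Returns the rank hired, or None if nobody is hired."""
    best_seen = min(order[:k]) if k > 0 else len(order) + 1
    for rank in order[k:]:
        if rank < best_seen:
            return rank
    return None

def success_probability(n, k):
    """Exact probability of hiring the best worker, over all n! interview orders."""
    wins = total = 0
    for order in permutations(range(1, n + 1)):
        total += 1
        wins += classical_hire(order, k) == 1
    return Fraction(wins, total)

def closed_form(n, k):
    """Standard expression (k/n) * sum_{i=k+1}^{n} 1/(i-1), valid for k >= 1."""
    return Fraction(k, n) * sum(Fraction(1, i - 1) for i in range(k + 1, n + 1))
```

With n = 6 and k = 2 (the integer part of 6/e), both functions give 77/180, roughly 0.43, already in the neighborhood of the limiting value 1/e.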

3.1 Common pools of workers 

In this section, we study the common hiring duel in which employers see the same candidates 
in the same order so that $\sigma_A = \sigma_B$ and each employer observes when the other hires. In this
case, the following strategy $\pi$ is a symmetric equilibrium: If the opponent has already hired,
then hire anyone who beats his employee; otherwise hire as soon as the current candidate has 
at least a 50% chance of being the best of the remaining candidates. 

Lemma 3. Strategy $\pi$ is efficiently computable and constitutes a symmetric equilibrium of the
common hiring duel. 

The computability follows from a derivation of probabilities in terms of binomials, and the
equilibrium claim follows by observing that there can be no profitable deviation. This strategy
also beats the classical algorithm, enabling us to provide non-trivial lower and upper bounds for
its beatability.

^4 In some cases, an employer also knows when and whom his opponent hired, and may condition his strategy on
this information as well. Only one of the settings described below needs this knowledge set; hence we defer our
discussion of this point for now and explicitly mention the necessary assumptions where appropriate.

Proof. For a round $i$, we compute a threshold $t_i$ such that $\pi$ hires if and only if the projected
rank of the current candidate $j$ is at most $t_i$. Note that if $i$ candidates are observed, the
probability that the $t_i$'th best among them is better than all remaining candidates is precisely
$\binom{i}{t_i} / \binom{n}{t_i}$. The numerator is the number of ways to place the 1 through $t_i$'th best candidates
overall among the first $i$ and the denominator is the number of ways to place the 1 through
$t_i$'th best among the whole order. Hence to efficiently compute $\pi$ we just need to compute $t_i$
or, equivalently, estimate these ratios of binomials and hire whenever, on round $i$ and observing
projected rank $j$, the ratio $\binom{i}{j} / \binom{n}{j}$ is at least 1/2.

We further note $\pi$ is a symmetric equilibrium since if an employer deviates and hires early
then by definition the opponent has a better than 50% chance of getting a better candidate.
Similarly, if an employer deviates and hires late then by definition his candidate has at most a
50% chance of being a better candidate than that of his opponent. □

Lemma 4. The beatability of the classical algorithm is at least 0.51 and at most 0.82.

The lower bound follows from the fact that $\pi$ beats the classical algorithm with probability
bounded above 1/2 when the classical algorithm hires early (i.e., before round $n/2$), and the
upper bound follows from the fact that the classical algorithm guarantees a probability of $1/e$
of hiring the best candidate, in which case no algorithm can beat it.

Proof. For the lower bound, note that in any event, $\pi$ guarantees a payoff of at least 1/2
against the classical algorithm. We next argue that for a constant fraction of the probability
space, $\pi$ guarantees a payoff of strictly better than 1/2. In particular, for some $q$, $1/e < q < 1/2$,
consider the event that the classical algorithm hires in the interval $\{n/e, qn\}$. This event happens
whenever the best among the first $qn$ candidates is not among the first $n/e$ candidates, and
hence has a probability of $(1 - 1/(qe))$. Conditioned on this event, $\pi$ beats the classical algorithm
whenever the best candidate overall is in the last $n(1 - q)$ candidates,^5 which happens with
probability $(1 - q)$ (the conditioning does not change this probability since it is only a property
of the permutation projected onto the first $qn$ elements). Hence the overall payoff of $\pi$ against
the classical algorithm is $(1 - q)(1 - 1/(qe)) + (1/2)(1/(qe))$. Optimizing for $q$ yields the result.

For the upper bound, note as mentioned above that the classical algorithm has a probability
approaching $1/e$ of hiring the best candidate. From here, we see $(1/(2e)) + (1 - 1/e) = 1 - 1/(2e) <
0.82$ is an upper bound on the beatability of the classical algorithm since the best an opponent
can do is always hire the best worker when the classical algorithm hires the best worker and
always hire a better worker when the classical algorithm does not hire the best worker. □

^5 This is a loose lower bound; there are many other instances where $\pi$ also wins, e.g., if the second-best candidate
is in the last $n(1 - q)$ candidates and the best occurs after the third best in the first $qn$ candidates.

3.2 Independent pools of workers

In this section, we study the independent hiring duel in which the employers see different
candidates. Thus $\sigma_A \ne \sigma_B$ and the employers do not see when the opponent hires. We use the
bilinear duel framework introduced in Section 2.1 to compute an equilibrium for this setting,
yielding the following theorem.

Theorem 1. The equilibrium strategies of the independent hiring duel are efficiently computable.

The main idea is to represent strategies $\pi$ by vectors $\{p_{ij}\}$ where $p_{ij}$ is the (total) probability
of hiring the $j$'th best candidate seen so far on round $i$. Let $q_i$ be the probability of reaching
round $i$, and note it can be computed from the $\{p_{ij}\}$. Recall $\pi(i, j)$ is the probability of hiring
the $j$'th best so far at round $i$ conditional on seeing the $j$'th best so far at round $i$. Thus
using Bayes' Rule we can derive an efficiently-computable bijective mapping (with an efficiently
computable inverse) $\phi(\pi)$ between $\pi$ and $\{p_{ij}\}$ which simply sets $\pi(i, j) = p_{ij}/(q_i/i)$. It only
remains to show that one can find a matrix $M$ such that the payoff of a strategy $\pi$ versus a
strategy $\pi'$ is $\phi(\pi)^T M \phi(\pi')$. This is done by calculating the appropriate binomials.

We show how to apply the bilinear duel framework to compute the equilibrium of the independent
hiring duel. This requires the following steps: define a subset $K$ of Euclidean space
to represent strategies, define a bijective mapping between K and feasible (mixed) strategies 
A(X), and show how to represent the payoff matrix of strategies in the bilinear duel space. We 
discuss each step in order. 

Defining K. For each $1 \le i \le n$ and $j \le i$ we define $p_{ij}$ to be the (total) probability of
seeing and hiring the $j$'th best candidate seen so far at round $i$. Our subspace $K \subseteq [0, 1]^{n(n+1)/2}$
consists of the collection of probabilities $\{p_{ij}\}$. To derive constraints on this space, we introduce
a new variable $q_i$ representing the probability of reaching round $i$. We note that the probability
of reaching round $(i + 1)$ must equal the probability of reaching round $i$ and not hiring, so that
$q_{i+1} = q_i - \sum_{j=1}^{i} p_{ij}$. Furthermore, the probability $p_{ij}$ cannot exceed the probability of reaching
round $i$ and interviewing the $j$'th best candidate seen so far. The probability of reaching round
$i$ is $q_i$ by definition, and the probability that the projected rank of the $i$'th candidate is $j$ is $1/i$
by our choice of a uniformly random permutation. Thus $p_{ij} \le q_i/i$. Together with the initial
condition that $q_1 = 1$, these constraints completely characterize $K$.

Mapping. Recall a strategy $\pi$ indicates for each $i$ and $j \le i$ the conditional probability
of making an offer given that the employer is interviewing the $i$'th candidate and his projected
rank is $j$ whereas $p_{ij}$ is the total probability of interviewing the $i$'th candidate with a projected
rank of $j$ and making an offer. Thus $\pi(i, j) = p_{ij}/(q_i/i)$ and so $p_{ij} = q_i \pi(i, j)/i$. Together
with the equalities derived above that $q_1 = 1$ and $q_{i+1} = q_i - \sum_{j=1}^{i} p_{ij}$, we can recursively
map any strategy $\pi$ to $K$ efficiently. To map back we just take the inverse of this bijection:
given a point $\{p_{ij}\}$ in $K$, we compute the (unique) $q_i$ satisfying the constraints $q_1 = 1$ and
$q_{i+1} = q_i - \sum_{j=1}^{i} p_{ij}$, and define $\pi(i, j) = p_{ij}/(q_i/i)$.
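
The recursion between conditional probabilities $\pi(i, j)$ and total probabilities $p_{ij}$ can be sketched in a few lines. The function names and the dictionary encoding keyed by `(i, j)` are illustrative assumptions; the convention of mapping unreachable rounds (where $q_i = 0$) to 0 in the inverse is also an assumption, since $\pi$ is not determined there.

```python
from fractions import Fraction

def pi_to_p(pi, n):
    """pi[(i, j)]: conditional hiring probability at round i, projected rank j.
    Returns p[(i, j)] = q_i * pi(i, j) / i and the reach probabilities q."""
    p, q = {}, {1: Fraction(1)}
    for i in range(1, n + 1):
        for j in range(1, i + 1):
            p[(i, j)] = q[i] * pi[(i, j)] / i
        # probability of reaching round i + 1 = reaching round i and not hiring
        q[i + 1] = q[i] - sum(p[(i, j)] for j in range(1, i + 1))
    return p, q

def p_to_pi(p, n):
    """Inverse map: pi(i, j) = p_ij / (q_i / i); rounds with q_i = 0 map to 0."""
    pi, q = {}, Fraction(1)
    for i in range(1, n + 1):
        for j in range(1, i + 1):
            pi[(i, j)] = p[(i, j)] * i / q if q > 0 else Fraction(0)
        q -= sum(p[(i, j)] for j in range(1, i + 1))
    return pi

# Example with n = 3: never hire in round 1, afterwards hire only a best-so-far.
n = 3
pi = {(i, j): Fraction(0) for i in range(1, n + 1) for j in range(1, i + 1)}
pi[(2, 1)] = Fraction(1)
pi[(3, 1)] = Fraction(1)
p, q = pi_to_p(pi, n)
```

In this example `q[4]` is the probability that the strategy never hires, and applying `p_to_pi` to `p` recovers the original `pi`.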

Payoff Matrix. By the above definitions, for any strategy $\pi$ and corresponding mapping
$\{p_{ij}\}$, the probability that the strategy hires the j'th best so far on round $i$ is $p_{ij}$.
Given that employer A hires the $j$'th best so far on round $i$ and employer B hires the $j'$'th
best so far on round $i'$, we define $M_{iji'j'}$ to be the probability that the overall rank of
employer A's hire beats that of employer B's hire plus one-half times the probability that their
ranks are equal. We can derive the entries of this matrix as follows: Let $E_r^X$ be the event
that with respect to permutation $\sigma_X$ the overall rank of a fixed candidate is $r$, and
$F_{ij}^X$ be the event that the projected rank of the last candidate in a random prefix of size
$i$ is $j$. Then

$$M_{iji'j'} = \sum_{r,r' : 1 \le r < r' \le n} \Pr[E_r^A \mid F_{ij}^A] \, \Pr[E_{r'}^B \mid F_{i'j'}^B] + \frac{1}{2} \sum_{1 \le r \le n} \Pr[E_r^A \mid F_{ij}^A] \, \Pr[E_r^B \mid F_{i'j'}^B].$$

Furthermore, by Bayes' rule, $\Pr[E_r^X \mid F_{ij}^X] = \Pr[F_{ij}^X \mid E_r^X] \Pr[E_r^X] / \Pr[F_{ij}^X]$,
where $\Pr[E_r^X] = 1/n$ and $\Pr[F_{ij}^X] = 1/i$. To compute $\Pr[F_{ij}^X \mid E_r^X]$, we select
the ranks of the other candidates in the prefix of size $i$. There are $\binom{r-1}{j-1}$ ways to
pick the ranks of the better candidates and $\binom{n-r}{i-j}$ ways to pick the ranks of the worse
candidates. As there are $\binom{n-1}{i-1}$ ways overall to pick the ranks of the other candidates,
we see:

$$\Pr[E_r^X \mid F_{ij}^X] = \frac{i}{n} \cdot \frac{\binom{r-1}{j-1} \binom{n-r}{i-j}}{\binom{n-1}{i-1}}.$$

Letting $\{p_{ij}\}$ be the mapping $\phi(\pi)$ of employer A's strategy $\pi$ and $\{p'_{ij}\}$ be
the mapping $\phi(\pi')$ of employer B's strategy $\pi'$, we see that
$c(\pi, \pi') = \phi(\pi)^T M \phi(\pi')$, as required.
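
As a sanity check on the counting argument, the conditional rank distribution and the resulting matrix entries can be computed exactly. The sketch below is illustrative only: the function names are assumptions, and it treats the two employers' overall ranks as independent, which is how the product form of the entries is read here. It uses the simplification $\frac{i}{n} / \binom{n-1}{i-1} = 1 / \binom{n}{i}$.

```python
from fractions import Fraction
from math import comb

def rank_given_projected(n, i, j, r):
    """Pr[overall rank is r | projected rank is j in a random prefix of size i]."""
    return Fraction(comb(r - 1, j - 1) * comb(n - r, i - j), comb(n, i))

def payoff_entry(n, i, j, i2, j2):
    """Probability that A's hire outranks B's, plus half the probability of a tie.

    A hires the j'th best so far at round i; B hires the j2'th best so far at
    round i2. A smaller overall rank is better.
    """
    a = [rank_given_projected(n, i, j, r) for r in range(1, n + 1)]
    b = [rank_given_projected(n, i2, j2, r) for r in range(1, n + 1)]
    win = sum((a[r] * b[s] for r in range(n) for s in range(r + 1, n)), Fraction(0))
    tie = sum((a[r] * b[r] for r in range(n)), Fraction(0))
    return win + tie / 2
```

For instance, with `n = 3` the best of all three candidates always has overall rank 1 and the worst always has rank 3, so `payoff_entry(3, 3, 1, 3, 3)` is a certain win for A.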

By the above arguments and the machinery from Section 2.1, we have proven the theorem
claiming that the equilibrium of the independent hiring duel is computable.

4 Compression Duel



In a compression duel, two competitors each choose a binary tree with leaf set $\Omega$. An element
$\omega \in \Omega$ is then chosen according to distribution $p$, and whichever player's tree has
$\omega$ closest to the root is the winner. This game can be thought of as a competition between
prefix-free compression schemes for a base set of words. The Huffman algorithm, which repeatedly
pairs nodes with lowest probability, is known to be optimal for single-player compression.

The compression duel is $D(X, \Omega, c, p)$, where $\Omega = [n]$ and $X$ is the set of binary
trees with leaf set $\Omega$. For $T \in X$ and $\omega \in \Omega$, $c(T, \omega)$ is the depth of
$\omega$ in $T$. In Section 4.3 we consider a variant in which not every element of $\Omega$ must
appear in the tree.

4.1 Computing an equilibrium 

The compression duel can be represented as a bilinear game. In this case, $K$ and $K'$ will be sets
of stochastic matrices, where a matrix entry $x_{ij}$ indicates the probability that item
$\omega_i$ is placed at depth $j$. The set $K$ is precisely the set of probability distributions
over node depths that are consistent with probability distributions over binary trees. We would
like to compute minmax optimal algorithms as in Section 2.2, but we do not have a randomized
rounding scheme that maps elements of $K$ to binary trees. Instead, following Section 2.3, we will
find approximate minmax strategies by constructing an $\epsilon$-best response oracle.

The mapping $\phi : X \to K$ is straightforward: it maps a binary tree to its depth profile. Also,
the expected payoff of $x \in K$ versus $x' \in K'$ is
$\sum_i p(\omega_i) \sum_j x_{ij} \left( \frac{1}{2} x'_{ij} + \sum_{k>j} x'_{ik} \right)$, which can
be written as $x^T M x'$ where matrix $M$ has bounded entries. To apply Lemma 2 we must now
provide an $\epsilon$-best response oracle, which we implement by reducing to a knapsack problem.

Fix $p$ and $x' \in K'$. We will reduce the problem of finding a best response for $x'$ to the
multiple-choice knapsack problem (MCKP), for which there is an FPTAS ^3]. In the MCKP, 
there are $n$ lists of items, say $\{(a_{i1}, \ldots, a_{ik_i}) \mid 1 \le i \le n\}$, with each
item $a_{ij}$ having a value $v_{ij} \ge 0$ and weight $w_{ij} \ge 0$. The problem is to choose
exactly one item from each list with total weight at most 1, with the goal of maximizing total
value. Our reduction is as follows. For each $\omega_i \in \Omega$ and $0 \le j \le n$, define
$w_{ij} = 2^{-j}$ and $v_{ij} = p(\omega_i) \left( \frac{1}{2} x'_{ij} + \sum_{d>j} x'_{id} \right)$.
This defines a MCKP input instance. For any given $t \in X$,
$v(\phi(t), x') = \sum_{\omega_i \in \Omega} v_{i d_t(i)}$ and
$\sum_{\omega_i \in \Omega} w_{i d_t(i)} \le 1$ by the Kraft inequality. Thus, any strategy for the
compression duel can be mapped to a solution to the MCKP. Likewise, a solution to the MCKP can be
mapped in a value-preserving way to a binary tree $t$ with leaf set $\Omega$, again by the Kraft
inequality. This completes the reduction.
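
To make the reduction concrete, the sketch below builds the MCKP instance from an opponent's depth profile and solves it by exhaustive search. The exhaustive search is a stand-in chosen for brevity, not the FPTAS cited above, and it is only practical for very small $n$; the function name and the list-of-lists layout of `x_opp` (row $i$, columns for depths $0, \ldots, n$) are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

def best_response_depths(p, x_opp):
    """Exact best response to the depth profile x_opp, by brute force.

    p[i]        : probability of item i
    x_opp[i][j] : probability that the opponent places item i at depth j, 0 <= j <= n
    Returns (value, depths), where depths[i] is the depth chosen for item i.
    """
    n = len(p)

    def value(i, j):
        # v_ij = p_i * (1/2 * x'_ij + sum_{d > j} x'_id)
        deeper = sum(x_opp[i][d] for d in range(j + 1, n + 1))
        return p[i] * (Fraction(1, 2) * x_opp[i][j] + deeper)

    best = None
    for choice in product(range(n + 1), repeat=n):
        # Knapsack constraint: total weight sum_i 2^(-depth_i) <= 1 (Kraft inequality).
        if sum(Fraction(1, 2 ** j) for j in choice) > 1:
            continue
        total = sum(value(i, j) for i, j in enumerate(choice))
        if best is None or total > best[0]:
            best = (total, choice)
    return best
```

Against an opponent who places two equally likely items at depth 1, no feasible depth assignment does better than matching it, so the best response value is 1/2.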

4.2 Beatability 

We will obtain a bound of 3/4 on the beatability of the Huffman algorithm. The high-level idea 
is to choose an arbitrary tree T and consider the leaves for which T beats H and vice-versa. We 
then apply structural properties of trees to limit the relative sizes of these sets of leaves, then 
use properties of Huffman trees to bound the relative probability that a sampled leaf falls in 
one set or the other. 

Before bounding the beatability of the Huffman algorithm in the No Fail compression model,
we review some facts about Huffman trees: nodes with lower probability occur deeper in the
tree, and siblings are always paired in order of probability (see, for example, page 402 of
Gersting [S]). In what follows, we will suppose that $H$ is a Huffman tree.

Fact 1. If $d_H(v_1) > d_H(v_2)$ then $p_H(v_1) \le p_H(v_2)$.

Fact 2. If $v_1$ and $v_2$ are siblings with $p_H(v_1) \le p_H(v_2)$, then for every node
$v_3 \in H$ either $p_H(v_3) \le p_H(v_1)$ or $p_H(v_3) \ge p_H(v_2)$.

We next give a bound on the relative probabilities of nodes on any given level of a Huffman
tree, subject to the tree not being too "sparse" at the subsequent (deeper) level. Let
$p_H^{\min}(d) = \min_{v : d_H(v) = d} p_H(v)$ and $p_H^{\max}(d) = \max_{v : d_H(v) = d} p_H(v)$.

Lemma 5. Choose any $d < \max_v d_H(v)$ and nodes $v, w$ such that $d_H(w) = d_H(v) = d$. If $v$ is
not the common ancestor of all nodes of depth greater than $d$, then $p_H(w) \le 3 p_H(v)$.

Proof. Let $a = p_H(v)$. By assumption there exists a non-leaf node $z \ne v$ with $d_H(z) = d$, say
with children $z_1$ and $z_2$. Then $p_H(z_1) \le a$ and $p_H(z_2) \le a$ by Fact 1, so
$p_H(z) \le 2a$. This implies that $v$'s sibling has probability at most $2a$ by Fact 2, so the
parent of $v$ has probability at most $3a$. Fact 1 then implies that $p_H(w) \le 3a$ as required. □

For any $T \in X$ and set of nodes $R \subseteq T$ we define the weight of $R$ to be
$w_T(R) = \sum_{v \in R} 2^{-d_T(v)}$. The Kraft inequality for binary trees is $w_T(T) \le 1$. In
fact, we have $w_T(T) = 1$ since we can assume each interior node of $T$ has two children.

Lemma 6. Choose $R \subseteq H$ such that no node of $R$ is a descendent of any other, and suppose
$w_H(R) = 2^{-d}$ for some $d \in [n]$. Then $p_H^{\min}(d) \le p(R) \le p_H^{\max}(d)$.

Proof. We will show $p(R) \le p_H^{\max}(d)$; the argument for the other inequality is similar. We
proceed by induction on $|R|$. If $|R| = 1$ the result is trivial (since $R = \{v\}$ where
$d_H(v) = d$). Otherwise, since $w_H(R) = 2^{-d}$, there must be at least two nodes of the maximum
depth present in $R$. Let $v$ and $w$ be the two such nodes with smallest probability, say with
$p_H(v) \le p_H(w)$. Let $w'$ be the parent of $w$. Then $p_H(w') \ge p_H(w) + p_H(v)$, since the
sibling of $w$ has probability at least $p_H(v)$ by Fact 2. Also, $w' \notin R$ since $w \in R$ and
no node of $R$ is a descendent of any other. Let $R' = R \cup \{w'\} - \{w, v\}$. Then
$w_H(R') = w_H(R)$, $p(R') \ge p(R)$, and no node of $R'$ is a descendent of any other. Thus, by
induction, $p(R) \le p(R') \le p_H^{\max}(d)$ as required. □

We are now ready to show that the beatability of the Huffman algorithm is at most 3/4.

Proposition 2. The beatability of the Huffman algorithm is at most 3/4.

Proof. Fix $\Omega$ and $p$. Let $H$ denote the Huffman tree and choose any other tree $T$. Define
$P = \{v \in \Omega : d_T(v) < d_H(v)\}$ and $Q = \{v \in \Omega : d_T(v) > d_H(v)\}$. That is, $P$
is the set of elements of $\Omega$ for which $T$ beats $H$, and $Q$ is the set of elements for which
$H$ beats $T$. Our goal is to show that $p(P) \le 3p(Q)$, which would imply that
$v(T,H) \le 3/4$.
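
The final implication can be checked directly. A short derivation, writing $U = \Omega - (P \cup Q)$ for the elements on which the two trees tie (notation assumed here for the purpose of the sketch):

```latex
\begin{align*}
v(T,H) &= p(P) + \tfrac{1}{2}\,p(U)
        = \tfrac{1}{2} + \tfrac{1}{2}\bigl(p(P) - p(Q)\bigr)
        && \text{since } p(P) + p(Q) + p(U) = 1, \\
p(P) - p(Q) &\le p(P) - \tfrac{1}{3}\,p(P) = \tfrac{2}{3}\,p(P)
        && \text{since } p(P) \le 3\,p(Q), \\
p(P) &\le 3\,p(Q) \le 3\bigl(1 - p(P)\bigr)
        \;\Longrightarrow\; p(P) \le \tfrac{3}{4}, \\
v(T,H) &\le \tfrac{1}{2} + \tfrac{1}{2}\cdot\tfrac{2}{3}\cdot\tfrac{3}{4} = \tfrac{3}{4}.
\end{align*}
```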

We first claim that $w_H(P) < w_H(Q)$. To see this, write $U = \Omega - (P \cup Q)$ and note that,
by the Kraft inequality,

$$w_H(P) + w_H(Q) + w_H(U) = 1 = w_T(P) + w_T(Q) + w_T(U). \quad (2)$$

Moreover, $w_T(Q) > 0$, $w_T(U) = w_H(U)$, and $w_T(P) \ge 2 w_H(P)$ (since $d_T(v) \le d_H(v) - 1$
for all $v \in P$). Applying these inequalities to (2) implies $w_H(P) - w_H(Q) < 0$, completing
the claim.

Our approach will be to express $P$ and $Q$ as disjoint unions $P = P_1 \cup \ldots \cup P_r$ and
$Q = Q_1 \cup \ldots \cup Q_r$ such that $p(P_i) \le 3p(Q_i)$ for all $i$. To this end, we express
the quantities $w_H(P)$ and $w_H(Q)$ in binary: choose $x_1, \ldots, x_n$ and $y_1, \ldots, y_n$
from $\{0,1\}$ such that $w_H(P) = \sum_i x_i 2^{-i}$ and $w_H(Q) = \sum_i y_i 2^{-i}$. Since
$w_H(P)$ is a sum of element weights that are inverse powers of two, we can partition the elements
of $P$ into disjoint subsets $P_1, \ldots, P_n$ such that $w_H(P_i) = x_i 2^{-i}$ for all
$i \in [n]$. Similarly, we can partition $Q$ into disjoint subsets $Q_1, \ldots, Q_n$ such that
$w_H(Q_i) = y_i 2^{-i}$ for all $i \in [n]$.
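
One way to see that such a partition exists is a greedy fill: process the set bits of the total weight from the most significant down, and fill each group with the heaviest remaining elements. Because every weight is a power of 1/2 and elements are taken heaviest first, the running sum never overshoots a group's target. The sketch below is an illustration under assumptions (the function name is hypothetical, the total weight is assumed to be strictly less than 1, and all depths are assumed to lie in $[1, n]$); weights are scaled by $2^n$ to keep the arithmetic in integers.

```python
def dyadic_partition(depths, n):
    """Partition elements of weight 2^(-depth) into groups 1..n.

    Group i receives total weight x_i * 2^(-i), where x_1 x_2 ... x_n are the
    binary digits of the total weight. Returns {i: [element indices]}.
    Assumes 1 <= depth <= n for every element and total weight < 1.
    """
    scale = 2 ** n
    order = sorted(range(len(depths)), key=lambda e: depths[e])  # heaviest first
    total = sum(scale >> depths[e] for e in order)
    groups = {}
    pos = 0
    for i in range(1, n + 1):
        target = scale >> i            # scaled weight 2^(-i)
        groups[i] = []
        if total & target:             # binary digit x_i equals 1
            filled = 0
            while filled < target:
                e = order[pos]
                pos += 1
                groups[i].append(e)
                filled += scale >> depths[e]
            if filled != target:
                raise ValueError("weights are not consistent with the assumptions")
    return groups
```

For depths `[2, 3, 3, 3]` the total weight is 5/8, which is 0.101 in binary, so group 1 collects weight 1/2, group 2 is empty, and group 3 collects weight 1/8.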

Let $r = \min\{i : x_i \ne y_i\}$. Note that, since $w_H(P) < w_H(Q)$, we must have $x_r = 0$ and
$y_r = 1$.

We first show that $p(P_i) \le 3p(Q_i)$ for each $i < r$. Since $x_i = y_i$, we either have
$P_i = Q_i = \emptyset$ or else $w_H(P_i) = w_H(Q_i) = 2^{-i}$. In the latter case, suppose first
that $|Q_i| = 1$. Then, since $Q_i$ consists of a single leaf and $i$ is not the maximum depth of
tree $H$, we can apply Lemma 6 and Lemma 5 to conclude $p(P_i) \le p_H^{\max}(i) \le 3p(Q_i)$. Next
suppose that $|Q_i| > 1$. We would again like to apply Lemma 5, but we must first verify that its
conditions are met. Suppose for contradiction that all nodes of depth greater than $i$ share a
common ancestor of depth $i$. Then, since $w_H(Q_i) = 2^{-i}$ and $|Q_i| > 1$, it must be that
$Q_i$ contains all such nodes, which contradicts the fact that $Q_r$ contains at least one node of
depth greater than $i$. We conclude that the conditions of Lemma 5 are satisfied for all $v$ and
$w$ at depth $i$, and therefore $p(P_i) \le p_H^{\max}(i) \le 3p_H^{\min}(i) \le 3p(Q_i)$ as
required.

We next consider $i \ge r$. Let $P'_r = \bigcup_{j \ge r} P_j$ and $Q'_r = \bigcup_{j \ge r} Q_j$.
We claim that $p(P'_r) \le 3p(Q'_r)$. If $P'_r = \emptyset$ then this is certainly true, so suppose
otherwise. Then $w_H(P'_r) < 2^{-r}$, so $P'_r$ contains elements of depth greater than $r$. As in
the case $i < r$, this implies that either $Q_r$ contains only a single node (and cannot be the
common ancestor of all nodes of depth greater than $r$), or else not all nodes of depth greater
than $r$ have a common ancestor of depth $r$. We can therefore apply Lemma 6 and Lemma 5 to
conclude $p(P'_r) \le p_H^{\max}(r) \le 3p(Q_r) \le 3p(Q'_r)$.

Since $P = P_1 \cup \ldots \cup P_{r-1} \cup P'_r$ and $Q = Q_1 \cup \ldots \cup Q_{r-1} \cup Q'_r$
are disjoint partitions, we conclude that $p(P) \le 3p(Q)$ as required. □

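We now give an example to demonstrate that the Huffman algorithm is at least
$(2/3 - \epsilon)$-beatable for every $\epsilon > 0$. For any $n \ge 3$, consider the probability
distribution given by $p(\omega_1) = \frac{1}{3}$, $p(\omega_i) = \frac{1}{3 \cdot 2^{i-2}}$ for all
$1 < i < n$, and $p(\omega_n) = \frac{1}{3 \cdot 2^{n-3}}$. For this distribution, the Huffman tree
$t$ satisfies $d_t(\omega_i) = i$ for each $i < n$ and $d_t(\omega_n) = n - 1$. Consider the
alternative tree $t'$ in which $d_{t'}(\omega_1) = n - 1$ and $d_{t'}(\omega_i) = i - 1$ for all
$i > 1$. Then $t'$ will win if any of $\omega_2, \omega_3, \ldots, \omega_{n-1}$ are chosen, and
will tie on $\omega_n$. Thus
$v(t', t) = \sum_{i=2}^{n-1} \frac{1}{3 \cdot 2^{i-2}} + \frac{1}{2} \cdot \frac{1}{3 \cdot 2^{n-3}} = \frac{2}{3} - \frac{1}{3 \cdot 2^{n-2}}$,
hence the Huffman algorithm is $\left( \frac{2}{3} - \frac{1}{3 \cdot 2^{n-2}} \right)$-beatable for
every $n \ge 3$.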
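
The arithmetic in this example is easy to confirm numerically. The sketch below assumes the distribution $p(\omega_1) = 1/3$, $p(\omega_i) = 1/(3 \cdot 2^{i-2})$ for $1 < i < n$ and $p(\omega_n) = 1/(3 \cdot 2^{n-3})$, which sums to 1 and is consistent with the stated value of the duel; the function names are illustrative. It takes the two depth profiles as given rather than running the Huffman algorithm, since tie-breaking among equal probabilities matters for this instance.

```python
from fractions import Fraction

def duel_value(p, d_mine, d_theirs):
    """Probability of strictly smaller depth, plus half the probability of a tie."""
    win = sum((q for q, a, b in zip(p, d_mine, d_theirs) if a < b), Fraction(0))
    tie = sum((q for q, a, b in zip(p, d_mine, d_theirs) if a == b), Fraction(0))
    return win + tie / 2

def beatability_example(n):
    """Return (p, Huffman depths, alternative depths) for the n-element example."""
    p = ([Fraction(1, 3)]
         + [Fraction(1, 3 * 2 ** (i - 2)) for i in range(2, n)]
         + [Fraction(1, 3 * 2 ** (n - 3))])
    d_huffman = list(range(1, n)) + [n - 1]   # d_t(w_i) = i for i < n, d_t(w_n) = n - 1
    d_alt = [n - 1] + list(range(1, n))       # d_t'(w_1) = n - 1, d_t'(w_i) = i - 1
    return p, d_huffman, d_alt
```

Both depth profiles satisfy the Kraft inequality with equality, so each corresponds to a full binary tree.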

We conclude the section by noting that if all probabilities are inverse powers of 2, the Huffman 
algorithm is minmax optimal. 

Proposition 3. Suppose there exist integers $a_1, \ldots, a_n$ such that $p(\omega_i) = 2^{-a_i}$
for each $i \le n$. Then the value of the Huffman tree $H$ is $v(H) = 1/2$.

Proof. Suppose that $p(\omega_i) = 2^{-a_i}$ for integers $a_1, \ldots, a_n$; our goal is to show
that $v(H) = 1/2$.

For this set of probabilities, the Huffman tree will set $d_H(\omega_i) = a_i$ for all
$\omega_i \in \Omega$. In this case, $p(R) = w_H(R)$ for all $R \subseteq H$. Choose any other tree
$T$, and define sets $P$ and $Q$ as in the proof of Proposition 2. That is, $P$ is the set of
elements of $\Omega$ for which $T$ beats $H$, and $Q$ is the set of elements for which $H$ beats
$T$. Then, as in Proposition 2, we must have $w_H(P) \le w_H(Q)$, and hence $p(P) \le p(Q)$. Thus
$v(T,H) \le 1/2$. We conclude that the best response to the Huffman tree $H$ must be $H$ itself,
and thus strategy $H$ has a value of 1/2. □

4.3 Variant: allowed failures 

We consider a variant of the compression duel in which an algorithm can fail to encode certain
elements. If we write $L(T)$ to be the set of leaves of binary tree $T$, then in the (original)
model of compression we require that $L(T) = \Omega$ for all $T \in X$, whereas in the "Fail" model
we require only that $L(T) \subseteq \Omega$. If $\omega \notin L(T)$, we will take
$c(T, \omega) = \infty$. The Huffman algorithm is optimal for single-player compression in the Fail
model.

We note that our method of computing approximate minmax algorithms carries over to this
variant; we need only change our best-response reduction to use a Multiple-Choice Knapsack
Problem in which at most one element is chosen from each list. What is different, however, is
that the Huffman algorithm is completely beatable in the Fail model. If we take
$\Omega = \{\omega_1, \omega_2\}$ with $p(\omega_1) = 1$ and $p(\omega_2) = 0$, the Huffman tree $H$
places each of the elements of $\Omega$ at depth 2. If $T$ is the singleton tree that consists of
$\omega_1$ as the root, then $v(T, H) = 1$.

5 Binary Search Duel 

In a binary search duel, $\Omega = [n]$ and $X$ is the set of binary search trees on $\Omega$ (i.e.
binary trees in which nodes are labeled with elements of $\Omega$ in such a way that an in-order
traversal visits the elements of $\Omega$ in sorted order). Let $p$ be a distribution on $\Omega$.
Then for $T \in X$ and $\omega \in \Omega$, $c(T, \omega)$ is the depth of the node labeled by
$\omega$ in the tree $T$. In single-player binary search with uniform $p$, selecting the median
element $m$ in $\Omega$ as the root node and recursing on the left $\{\omega \mid \omega < m\}$ and
right $\{\omega \mid \omega > m\}$ subsets to construct sub-trees is known to be optimal.
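
The median construction is short enough to write out. The sketch below returns the depth of each item in the resulting tree; it assumes the root has depth 1 (so that depth equals the number of queries needed to find the item) and breaks ties toward the lower median when an interval has even size. The function name is illustrative.

```python
def median_search_depths(n):
    """Depths of items 1..n in the tree that always queries the median of the live interval."""
    depth = {}

    def build(lo, hi, d):
        if lo > hi:
            return
        m = (lo + hi) // 2           # lower median for even-sized intervals
        depth[m] = d
        build(lo, m - 1, d + 1)      # items smaller than the median
        build(m + 1, hi, d + 1)      # items larger than the median

    build(1, n, 1)
    return [depth[k] for k in range(1, n + 1)]
```

For `n = 7` this gives the perfectly balanced tree with item 4 at the root.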

The binary search game can be represented as a bilinear duel. In this case, $K$ and $K'$ will
be sets of stochastic matrices (as in the case of the compression game) and the entry $x_{i,j}$
will represent the probability that item $\omega_j$ is placed at depth $i$. Of course, not every
stochastic matrix is realizable as a distribution on binary search trees (i.e. such that the
probability $\omega_j$ is placed at depth $i$ is $x_{i,j}$). In order to define linear constraints
on $K$ so that any matrix in $K$ is realizable, we will introduce an auxiliary data structure in
Section 5.3 called the State-Action Structure that captures the decisions made by a binary search
tree. Using these ideas, we will be able to fit the binary search game into the bilinear duel
framework introduced in Section 2.2 and hence be able to efficiently compute a Nash equilibrium
strategy for each player.

Given a binary search tree $T \in X$, we will write $c_T(\omega)$ for the depth of $\omega$ in $T$.
We will also refer to $c_T(\omega)$ as the time that $T$ finds $\omega$.

5.1 Computing an equilibrium 

In this subsection, we give an algorithm for computing a Nash equilibrium for the binary search
game, based on the bilinear duel framework introduced in Section 2.2. We will do this by
defining a structure called the State-Action Structure that we can use to represent the
decisions made by a binary search tree using only polynomially many variables. The set of valid
variable assignments in a State-Action Structure will also be defined by only polynomially
many linear constraints and so these structures will naturally be closed under taking convex
combinations. We will demonstrate that the value of playing $\sigma \in \Delta(X)$ against any value
matrix $V$ (see Definition 1) is a linear function of the variables in the State-Action Structure
corresponding to $\sigma$. Furthermore, all valid State-Action Structures can be efficiently
realized as a distribution on binary search trees which achieves the same expected value.

To apply the bilinear duel framework, we must give a mapping $\phi$ from the space of binary
search trees to a convex set $K$ defined explicitly by a polynomial number of linear constraints
(on a polynomial number of variables). We now give an informal description of $K$: The idea
is to represent a binary search tree $T \in X$ as a layered graph. The nodes (at each depth)
alternate in type. One layer represents the current knowledge state of the binary search tree. 
After making some number of queries (and not yet finding the token), all the information that 
the binary search tree knows is an interval of values to which the token is confined - we refer to 
this as the live interval. The next layer of nodes represents an action - i.e. a query to some item 
in the live interval. Correspondingly, there will be three outgoing edges from an action node 
representing the possible replies that either the item is to the left, to the right, or at the query 
location (in which case the outgoing edge will exit to a terminal state). 

We will define a flow on this layered graph based on $T$ and the distribution $p$ on $\Omega$. Flow
will represent total probability - i.e. the total flow into a state node will represent the
probability (under a random choice of $\omega \in \Omega$ according to $p$) that $T$ reaches this
state of knowledge (in exactly the corresponding number of queries). Then the flow out of a state
node represents a decision of which item to query next. And lastly, the flow out of an action node
splits according to Bayes' Rule - if all the information revealed so far is that the token is
confined to some interval, we can express the probability that (say) our next query to a
particular item finds the token as a conditional probability. We can then take convex combinations
of these "basic" flows in order to form flows corresponding to distributions on binary search
trees.

We give a randomized rounding algorithm to select a random binary search tree based on a
flow - in such a way that the marginal probabilities of finding a token $\omega_i$ at time $r$ are
exactly what the flow specifies they should be. The idea is that if we choose an outgoing edge for
each state node (with probability proportional to the flow), then we have fixed a binary search
tree because we have specified a decision rule for each possible internal state of knowledge.
Suppose we were to now select an edge out of each action node (again with probability proportional
to the flow) and we were to follow the unique path from the start node to a terminal node. This
procedure would be equivalent to searching for a randomly chosen token $\omega_i$ chosen according
to $p$ and using this token to choose outgoing edges from action nodes. This procedure generates
a random path from the start node to a terminal node, and is in fact equivalent to sampling a
random path in the path decomposition of the flow proportionally to the flow along the path.
Because these two rounding procedures are equivalent, the marginal distribution that results
from generating a binary search tree (and choosing a random element to look for) will exactly
match the corresponding values of the flow.



5.2 Notation 

The natural description of the strategy space of the binary search game is exponential (in
$|\Omega|$) - so we will assume that the value of playing any binary search tree $T$ against an
opponent's mixed strategy is given to us in a compact form which we will refer to as a value
matrix:

Definition 1. A value matrix $V$ is an $|\Omega| \times |\Omega|$ matrix in which the entry
$V_{i,j}$ is interpreted to be the value of finding item $\omega_j$ at time $i$.

Given any binary search tree $T' \in X$, we can define a value matrix $V(T')$ so that the
expected value of playing any binary search tree $T \in X$ against $T'$ in the binary search game
can be written as $\sum_{i,j} p(\omega_j) \, 1_{c_T(\omega_j) = i} \, V(T')_{i,j}$:

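Definition 2. Given a binary search tree $T' \in X$, let $V(T')$ be a value matrix such that

$$V(T')_{i,j} = \begin{cases} 0 & \text{if } c_{T'}(\omega_j) < i \\ \frac{1}{2} & \text{if } c_{T'}(\omega_j) = i \\ 1 & \text{if } c_{T'}(\omega_j) > i \end{cases}$$

Similarly, given a mixed strategy $\sigma' \in \Delta(X)$, let
$V(\sigma') = E_{T' \sim \sigma'}[V(T')]$.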

Note that not every value matrix $V$ can be realized as the value matrix $V(T')$ for some
$T' \in X$. In fact, $V$ need not be realizable as $V(\sigma)$ for some $\sigma \in \Delta(X)$.
However, we will be able to compute the best response against any value matrix $V$, regardless of
whether or not the matrix corresponds to playing the binary search game against an adversary
playing some mixed strategy. Lastly, we define a stochastic matrix $I(T)$, given $T \in X$. From
$I(T)$ and $V(T')$ we can write the expected value of playing $T$ against $T'$ as an inner product.
We let $\langle A, B \rangle_p = \sum_{i,j} A_{i,j} B_{i,j} \, p(\omega_j)$ when $A$ and $B$ are
$|\Omega| \times |\Omega|$ matrices.

Definition 3. Given a binary search tree $T \in X$, let $I(T)$ be an $|\Omega| \times |\Omega|$
matrix in which $I(T)_{i,j} = 1_{c_T(\omega_j) = i}$. Similarly, given $\sigma \in \Delta(X)$, let
$I(\sigma) = E_{T \sim \sigma}[I(T)]$.

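Lemma 7. Given $\sigma, \sigma' \in \Delta(X)$, the expected value of playing $\sigma$ against
$\sigma'$ in the binary search game is exactly $\langle I(\sigma), V(\sigma') \rangle_p$.

Proof. Consider any $T, T' \in X$. Then the expected value of playing $T$ against $T'$ in the
binary search game is exactly
$\sum_i p(\omega_i) \left[ 1_{c_T(\omega_i) < c_{T'}(\omega_i)} + \frac{1}{2} 1_{c_T(\omega_i) = c_{T'}(\omega_i)} \right] = \langle I(T), V(T') \rangle_p$.
And since $\langle I(T), V(T') \rangle_p$ is bilinear in the matrices $I(T)$ and $V(T')$, indeed the
expected value of playing $\sigma$ against $\sigma'$ is
$\langle I(\sigma), V(\sigma') \rangle_p$. □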
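
The identity in Lemma 7 can be checked on small instances by building both matrices from depth vectors. The sketch below is illustrative: trees are represented only by their depth vectors (`depths[j]` is the time at which item `j` is found, with times starting at 1), and all function names are assumptions.

```python
from fractions import Fraction

def indicator_matrix(depths):
    """I(T): entry (i, j) is 1 exactly when item j is found at time i."""
    n = len(depths)
    return [[1 if depths[j] == i else 0 for j in range(n)] for i in range(1, n + 1)]

def value_matrix(depths_opp):
    """V(T'): value of finding item j at time i against an opponent with these depths."""
    n = len(depths_opp)

    def entry(i, j):
        c = depths_opp[j]
        if c < i:
            return Fraction(0)       # opponent finds the item first
        if c == i:
            return Fraction(1, 2)    # tie
        return Fraction(1)           # we find the item first

    return [[entry(i, j) for j in range(n)] for i in range(1, n + 1)]

def inner_product(a, b, p):
    """<A, B>_p = sum_{i,j} A_ij * B_ij * p(w_j)."""
    n = len(p)
    return sum((a[i][j] * b[i][j] * p[j] for i in range(n) for j in range(n)), Fraction(0))

def direct_value(depths, depths_opp, p):
    """Win probability plus half the tie probability, computed item by item."""
    score = {True: Fraction(1), False: Fraction(0)}
    return sum((q * (score[a < b] + score[a == b] / 2)
                for q, a, b in zip(p, depths, depths_opp)), Fraction(0))
```

With three equally likely items, the median tree (depths `[2, 1, 2]`) against the chain rooted at the smallest item (depths `[1, 2, 3]`) loses on the first item and wins on the other two.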



5.3 State-Action Structure

Definition 4. Given a distribution $p$ on $\Omega$ and $\omega_i, \omega_j, \omega_k \in \Omega$
(with $i \le k \le j$), let

$$p^L_{i,j,k} = \frac{\Pr_{\omega_{k'} \sim p}[i \le k' < k]}{\Pr_{\omega_{k'} \sim p}[i \le k' \le j]}, \quad p^E_{i,j,k} = \frac{\Pr_{\omega_{k'} \sim p}[k' = k]}{\Pr_{\omega_{k'} \sim p}[i \le k' \le j]}, \quad p^R_{i,j,k} = \frac{\Pr_{\omega_{k'} \sim p}[k < k' \le j]}{\Pr_{\omega_{k'} \sim p}[i \le k' \le j]}.$$

Intuitively, we can regard the interval $[\omega_i, \omega_j]$ as being divided into the
sub-intervals $[\omega_i, \omega_{k-1}]$, $\{\omega_k\}$ and $[\omega_{k+1}, \omega_j]$. Then the
quantity $p^L_{i,j,k}$ represents the probability that a randomly generated element is contained
in the first interval, conditioned on the element being contained in the original interval
$[\omega_i, \omega_j]$. Similarly, one can interpret $p^E_{i,j,k}$ and $p^R_{i,j,k}$ as being
conditional probabilities as well.

We also define a set of knowledge states, which represent the current information that the
binary search tree knows about the element and also how many queries have been made:

Definition 5. We define:

1. $\mathcal{S} = \{(i,j,r) \mid \omega_i, \omega_j \in \Omega, \ i \le j, \text{ and } r \in \{1, 2, \ldots, |\Omega|\}\}$

2. $\mathcal{A} = \{(S,k) \mid S = (i,j,r) \in \mathcal{S}, \ \omega_k \in \Omega \text{ and } k \in [i,j]\}$

3. $\mathcal{F} = \{(k,r) \mid \omega_k \in \Omega \text{ and } r \in \{1, 2, \ldots, |\Omega|\}\}$

We will refer to $\mathcal{S}$ as the set of knowledge states. Additionally we will refer to
$S_{start} = (1, n, 0)$ as the start state. We will refer to $\mathcal{A}$ as the set of action
states and $\mathcal{F}$ as the set of termination states.

We can now define a State-Action Structure:

Definition 6. A State-Action Structure is a fixed directed graph generated as:

1. Create a node $n_S$ for each $S \in \mathcal{S}$, a node $n_A$ for each $A \in \mathcal{A}$ and a
node $n_F$ for each $F \in \mathcal{F}$.

2. For each $S = (i,j,r) \in \mathcal{S}$, and for each $k$ such that $i \le k \le j$, create a
directed edge $e_{S,A}$ from $S$ to $A = (S,k) \in \mathcal{A}$.

3. For each $A = (S,k) \in \mathcal{A}$ and $S = (i,j,r)$, create a directed edge $e_{A,F}$ from $A$
to $F = (k, r+1)$ and directed edges $e_{A,S_L}$ and $e_{A,S_R}$ from $A$ to $S_L$ and $S_R$
respectively for $S_L = (i, k-1, r+1)$ and $S_R = (k+1, j, r+1)$.

We will define a flow on this directed graph. The source of this flow will be the start node 
Sstart and the node corresponding to each termination state will be a sink. The total flow in this 
graph will be one unit, and this flow should be interpreted as representing the total probability 
of reaching a particular knowledge state, or performing a certain action. 

Definition 7. We will call a set of values $x_e$ for each directed edge $e$ in a State-Action
Structure a stateful flow if (let us adopt the notation that $x_{S,A}$ is the flow on an edge
$e_{S,A}$):

1. For all $e$, $0 \le x_e \le 1$

2. All nodes except $n_{S_{start}}$ and $n_F$ (for $F \in \mathcal{F}$) satisfy conservation of flow

3. For each action state $A = (S,k) \in \mathcal{A}$ for $S = (i,j,r)$, the flow on the three
outgoing edges $e_{A,F}$, $e_{A,S_L}$ and $e_{A,S_R}$ from $n_A$ satisfies
$x_{A,F} = p^E_{i,j,k} C$, $x_{A,S_L} = p^L_{i,j,k} C$ and $x_{A,S_R} = p^R_{i,j,k} C$, where
$C = \sum_{e = (S',A), \, S' \in \mathcal{S}} x_{S',A}$.

Given $T \in X$, we can define a flow $x_T$ in the State-Action Structure that captures the
decisions made by $T$:

Definition 8. Given $T \in X$, define $x_T$ as follows:

1. For each $S = (i,j,r) \in \mathcal{S}$ let $T_{i,j}$ be the sub-tree of $T$ (if a unique such
sub-tree exists) such that the labels contained in $T_{i,j}$ are exactly
$\{\omega_i, \omega_{i+1}, \ldots, \omega_j\}$. Suppose that the root of this sub-tree $T_{i,j}$ is
$\omega_k$. Then send all flow entering the node $n_S$ on the outgoing edge $e_{S,A}$ for
$A = (S,k)$.

2. For each $A \in \mathcal{A}$, divide flow into an action node $n_A$ according to Condition 3 in
Definition 7 among outgoing edges.

Note that the flow out of $n_{S_{start}}$ is one. Of course, the choice of how to split flow on
outgoing edges from an action node $n_A$ is already well-defined. But we need to demonstrate that
$x_T$ does indeed satisfy conservation of flow requirements, and hence is a stateful flow:

Lemma 8. For any $T \in X$, $x_T$ is a stateful flow.

Proof. For some intervals $\{\omega_i, \omega_{i+1}, \ldots, \omega_j\}$, there is no sub-tree in
$T$ for which the labels contained in the sub-tree are exactly
$\{\omega_i, \omega_{i+1}, \ldots, \omega_j\}$. If there is such a sub-tree, however, it is
clearly unique. We will prove by induction that the only state nodes in the State-Action
Structure which are reached by flow $x_T$ are state nodes for which there is such a sub-tree.

We will prove this condition by induction on $r$ for state nodes $n_S$ of the form $S = (i,j,r)$.
This condition is true in the base case because all flow starts at the node $n_{S_{start}}$ and
$S_{start} = (1, n, 0)$, and indeed the entire binary search tree $T$ has the property that the set
of labels used is exactly $\{\omega_1, \omega_2, \ldots, \omega_n\}$.

Suppose by induction that there is some sub-tree $T_{i,j}$ of $T$ for which the labels contained
in the sub-tree are exactly $\{\omega_i, \omega_{i+1}, \ldots, \omega_j\}$. Let $\omega_k$ be the
label of the root node of $T_{i,j}$. Then all flow entering $n_S$ would be sent to the action node
$A = (S,k)$ and all flow out of this action node would be sent to either a termination node or to
state nodes $S_L = (i, k-1, r+1)$ or $S_R = (k+1, j, r+1)$, and both of the intervals
$\{\omega_i, \ldots, \omega_{k-1}\}$ and $\{\omega_{k+1}, \ldots, \omega_j\}$ do indeed have the
property that there is a sub-tree that contains exactly each respective set of labels - these are
just the left and right sub-trees of $T_{i,j}$. □

The variables in a stateful flow capture marginal probabilities that we need to compute the 
expected value of playing a binary search tree T against some value matrix V: 

Lemma 9. Consider any state $S = (i,j,r) \in \mathcal{S}$. The total flow in $x_T$ into $n_S$ is
exactly the probability that (under a random choice of $\omega_k \sim p$) $\omega_k$ is contained
in a sub-tree of $T$ rooted at depth $r+1$ whose labels are exactly
$\{\omega_i, \ldots, \omega_j\}$. Similarly the total flow in $x_T$ into any terminal node $n_F$
for $F = (\omega_f, r)$ is exactly the probability (under a random choice of $\omega_k \sim p$)
that $\omega_k = \omega_f$ and $c_T(\omega_k) = r$.

Proof. We can again prove this lemma by induction on $r$ for state nodes $n_S$ of the form
$S = (i,j,r)$. In the base case, the flow into $n_{S_{start}}$ is 1, which is exactly the
probability that (under a random choice of $\omega_k \sim p$) $\omega_k$ is contained in some
sub-tree of $T$ at depth 1.

So we can prove the inductive hypothesis by sub-conditioning on the event that the element
$\omega_k$ is contained in some sub-tree of $T$ at depth $r$. Let this sub-tree be $T'$. By the
inductive hypothesis, this is exactly the flow into the node $n_{S'}$ where $S' = (i,j,r-1)$ for
some $\omega_i, \omega_j \in \Omega$ with $i \le k \le j$. We can then condition on the event that
$\omega_k$ is such that $i \le k \le j$. Let $\omega_m$ be the label of the root node of $T'$. Then
using conditioning, the probability that $\omega_k$ is contained in the left sub-tree of $T'$ is
exactly $p^L_{i,j,m}$, and similarly for the right sub-tree. Also the probability that
$\omega_k = \omega_m$ is $p^E_{i,j,m}$. And so Condition 3 in Definition 7 enforces the condition
that the flow splits exactly as this total probability splits - i.e. according to the probability
that $\omega_k$ is contained in the left or right sub-interval of
$\{\omega_i, \omega_{i+1}, \ldots, \omega_j\}$ or is the root $\omega_m$ respectively. Note that
the set of sub-trees at any particular depth in $T$ correspond to disjoint intervals of $\Omega$,
and hence there is no other flow entering the state $n_S$, and this proves the inductive
hypothesis. □
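
The flow $x_T$ can be computed by pushing one unit of flow down from the start state and splitting it at each action node by the conditional probabilities of Definition 4. The sketch below does this for a tree described by a hypothetical callback `root(i, j)` that returns the item queried on the live interval $[i, j]$; it records only the flow into terminal nodes, keyed as `(k, time)`. The names and the representation are assumptions made for illustration.

```python
from fractions import Fraction

def terminal_flows(p, root):
    """Flow of x_T into each terminal node (k, time), for items numbered 1..n.

    p[k - 1]   : probability of item k
    root(i, j) : item queried by the tree when the live interval is [i, j]
    """
    flows = {}

    def push(i, j, r, f):
        # f is the flow entering knowledge state (i, j, r).
        if i > j or f == 0:
            return
        k = root(i, j)
        mass = sum(p[i - 1:j])                                  # Pr[item in [i, j]]
        flows[(k, r + 1)] = f * p[k - 1] / mass                 # token found at the query
        push(i, k - 1, r + 1, f * sum(p[i - 1:k - 1]) / mass)   # token lies to the left
        push(k + 1, j, r + 1, f * sum(p[k:j]) / mass)           # token lies to the right

    push(1, len(p), 0, Fraction(1))
    return flows
```

For a median tree on three items with probabilities 1/2, 1/4, 1/4, each terminal receives exactly the probability of its item, at the time the tree finds that item.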

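As an immediate corollary:

Corollary 1. The expected value of playing $T$ against value matrix $V$ is

$$\langle I(T), V \rangle_p = \sum_{F = (\omega_k, r) \in \mathcal{F}} x_T^{in}(F) \, V_{r,k}$$

where $x_T^{in}$ denotes the total flow into a node according to $x_T$.

And as a second corollary:

Corollary 2. Given $T \in X$,

$$V(T)_{i,j} = \frac{\frac{1}{2} x_T^{in}(\omega_j, i) + \sum_{i' > i} x_T^{in}(\omega_j, i')}{p(\omega_j)}$$

where $x_T^{in}(\omega_j, i)$ denotes the total flow into $n_F$ for
$F = (\omega_j, i) \in \mathcal{F}$.

5.4 A rounding algorithm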

Proposition 4. Given a stateful flow $x$, there is an efficient randomized rounding procedure that
generates a random $T \in X$ with the property that for any $\omega_j \in \Omega$ and for any
$i \in \{1, 2, \ldots, |\Omega|\}$,

$$\Pr[c_T(\omega_j) = i] = \frac{x^{in}(\omega_j, i)}{p(\omega_j)}.$$

Proof. Note that $x$ is a unit flow from $n_{S_{start}}$ to the set of sink nodes $n_F$ for
$F \in \mathcal{F}$. So if we could sample a random path proportional to the total flow along the
path, the probability that the path ends at any sink $n_F$ for $F = (\omega_j, r)$ is exactly
$x^{in}(\omega_j, r)$.

First Rounding Procedure: Consider the following procedure for generating a path according to
this distribution - i.e. the probability of generating any path is exactly the flow along the
path: Start at the source node, and at every step choose a new edge to traverse proportionally
to the flow along it. So if the process is currently at some node $n_S$ and the total flow into
the node is $U$, and the total flow on some outgoing edge $e$ is $u$, edge $e$ is chosen with
probability exactly $u/U$ and the process continues until a sink node is reached. Notice that this
procedure always terminates in $O(|\Omega|)$ steps because each time we traverse an action node
$n_A$, the counter $r$ is incremented and every edge in a State-Action Structure either points
into or points out of an action node.

The key to our randomized rounding procedure is an alternative way to generate a path
from the source node to a sink such that the probability that the path ends at any sink $n_F$ for
$F = (\omega_j, r)$ is still exactly $x^{in}(\omega_j, r)$. Instead, for each state node $n_S$, we
choose an outgoing edge in advance (to some action node) proportional to the flow of $x$ on that
edge.

Second Rounding Procedure: If we fix these choices in advance, we can define an alternate
path selection procedure which starts at the source node and traverses any edges that have
already been decided upon. Whenever the process reaches an action node (in which case the
outgoing edge has not been decided upon), we can select an edge proportional to the total flow
on the edge. This procedure still satisfies the property that the probability that the path ends
at any sink $n_F$ for $F = (\omega_j, r)$ is exactly $x^{in}(\omega_j, r)$.

Third Rounding Procedure: Next, consider another modification to this procedure.
Imagine still that the outgoing edges from every state node are chosen (randomly, as above in
the Second Rounding Procedure). Instead of choosing which outgoing edge to pick from
an action node when we reach it, we could instead pick an item $\omega_{k'} \sim p$ in advance and
use this hidden value to determine which outgoing edge from an action node to traverse. We will
maintain the invariant that if we are at $n_A$ and $A = (S,k)$ for $S = (i,j,r)$, we must have
$i \le k' \le j$. This is clearly true at the base case. Then we will traverse the edge $e_{A,F}$
for $F = (k, r+1)$ if $\omega_{k'} = \omega_k$. Otherwise if $i \le k' \le k-1$ we will traverse
the edge $e_{A,S_L}$ for $S_L = (i, k-1, r+1)$. Otherwise $k+1 \le k' \le j$ and we will traverse
the edge $e_{A,S_R}$ for $S_R = (k+1, j, r+1)$. This clearly maintains the invariant that $k'$ is
contained in the interval corresponding to the current knowledge state.

This third procedure is equivalent to the second procedure. This follows from interpreting
Condition 3 in Definition 7 as a rule for splitting flow that is consistent with the conditional
probability that $\omega_{k'}$ is contained in the left or right sub-interval of
$\{\omega_i, \omega_{i+1}, \ldots, \omega_j\}$ or is equal to $\omega_k$, conditioned on
$\omega_{k'} \in \{\omega_i, \omega_{i+1}, \ldots, \omega_j\}$. An identical argument is used in
the proof of Lemma 9. In this case, we will say that $\omega_{k'}$ is the rule for choosing edges
out of action nodes.

Now we can prove the Proposition: The key insight is that once we have chosen the outgoing
edges from each state node (but not which outgoing edges from each action node), we have
determined a binary search tree: Given any element $\omega_{k'}$, if we follow outgoing edges from
action nodes using $\omega_{k'}$ as the rule, we must reach a terminal node $F = (\omega_{k'}, r)$
for some $r$. In fact, the value of $r$ is determined by $\omega_{k'}$ because once $\omega_{k'}$
is chosen, there are no more random choices. So we can compute a vector $u$ of dimension
$|\Omega|$ such that $u_j = r$, where $F = (\omega_j, r)$ is the terminal node reached when
$\omega_j$ is the rule for choosing edges out of action nodes.

Using the characterization in Proposition 6, it is easy to verify that the transition rules in the State Action Structure enforce that $u$ is a depth vector, and hence we can compute a binary search tree $T$ which has the property that using selection rule $\omega_j$ results in reaching the sink node $F = (\omega_j, c_T(\omega_j))$.

Suppose we select each outgoing edge from a state node (as in the Third Rounding Procedure) and select an $\omega_{k'} \sim p$ (again as in the Third Rounding Procedure) independently. Then from the choices of the outgoing edges from each state node, we can recover a binary search tree $T$. Then $\Pr_{T, \omega_{k'}}[\omega_{k'} = \omega_k \text{ and } c_T(\omega_{k'}) = r] = x^{in}(\omega_k, r)$, precisely because the First Rounding Procedure and the Third Rounding Procedure are equivalent. We can then apply Bayes' Rule to compute that

$$\Pr_T[c_T(\omega_{k'}) = r \mid \omega_{k'} = \omega_k] = \frac{x^{in}(\omega_k, r)}{p(\omega_k)}.$$

□



Theorem 5. There is an algorithm that runs in time polynomial in $|\Omega|$ that computes an exact Nash equilibrium for the binary search game.



Proof. We can now apply the bilinear duel framework introduced in Section 2.2 to the binary search game. The space $K$ is the set of all stateful flows. The set of variables is polynomially sized (see Definition [SI), and the set of linear constraints is also polynomially sized and is given explicitly in Definition 7. The function $\phi$ maps binary search trees $T \in X$ to a stateful flow $x_T$, and the procedure given in Definition 8 for computing this mapping is efficient. Also, the payoff matrix $M$ is given explicitly in Corollary 1 and Corollary 2. Lastly, we give a randomized rounding algorithm in Proposition 4. □



5.5 Beatability 

We next consider the beatability of the classical algorithm when $p$ is the uniform distribution on $\Omega$. For lack of a better term, let us call this single-player optimum the median binary search, or median search.

Here we give matching upper and lower bounds on the beatability of median search. The 
idea is that an adversary attempting to do well against median search can only place one item 
at depth 1, two items at depth 2, four items at depth 3 and so on. We can regard these as 
budget restrictions - the adversary cannot choose too many items to map to a particular depth. 
There are additional combinatorial restrictions, as well. For example, an adversary cannot place
two labels of depth 2 both to the right of the label of depth 1 - because even though the root 
node in a binary search tree can have two children, it cannot have more than one right child. 

But suppose we relax this restriction, and only consider budget restrictions on the adver- 
sary. Then the resulting best response question becomes a bipartite maximum weight matching 
problem. Nodes on the left (in this bipartite graph) represent items, and nodes on the right 
represent depths (there is one node of depth 1, two nodes of depth 2, ...). And for any choice 
of a depth to assign to a node, we can evaluate the value of this decision - if this decision beats 
median search when searching for that element, we give the corresponding edge weight 1. If it 
ties median search, we give the edge weight 1/2, and otherwise we give the edge zero weight.

We give an upper bound on the value of a maximum weight matching in this graph, hence 
giving an upper bound on how well an adversary can do if he is subject to only budget re- 
strictions. If we now add the combinatorial restrictions too, this only makes the best response 
problem harder. So in this way, we are able to bound how much an adversary can beat median 
search. In fact, we give a lower bound that matches this upper bound - so our relaxation did 
not make the problem strictly easier (to beat median search). 

We focus on the scenario in which $|\Omega| = 2^r - 1$ and $p$ is the uniform distribution. Throughout this section we denote $n = |\Omega|$. The reason we fix $n$ to be of the form $2^r - 1$ is that the optimal single-player strategy is then well-defined: the first query will be at precisely the median element, and if the element $\omega$ is not found on this query, then the problem breaks down into one of two possible sub-problems of size $2^{r-1} - 1$. For this case, we give asymptotically matching upper and lower bounds on the beatability of median search.

Definition 9. We will call a $|\Omega|$-dimensional vector $u$ over $\{1, 2, \ldots, |\Omega|\}$ a depth vector (over the universe $\Omega$) if there is some $T \in X$ such that $u_j = c_T(\omega_j)$.

Proposition 6. A $|\Omega|$-dimensional vector $u$ over $\{1, 2, \ldots, |\Omega|\}$ is a depth vector (over the universe $\Omega$) if and only if

1. exactly one entry of $u$ is set to 1 (let the corresponding index be $j$), and
2. the vectors $[u_1 - 1, u_2 - 1, \ldots, u_{j-1} - 1]$ and $[u_{j+1} - 1, u_{j+2} - 1, \ldots, u_n - 1]$ are depth vectors over the universes $\{\omega_1, \omega_2, \ldots, \omega_{j-1}\}$ and $\{\omega_{j+1}, \omega_{j+2}, \ldots, \omega_n\}$ respectively.

Proof. Given any vector $u$ that (recursively) satisfies Conditions 1 and 2 above, one can build a binary search tree on $\Omega$ inductively. Let $\omega_j \in \Omega$ be the unique item such that $u_j = 1$, which exists because $u$ satisfies Condition 1. Since $u$ satisfies Condition 2, the vectors $u_L = [u_1 - 1, u_2 - 1, \ldots, u_{j-1} - 1]$ and $u_R = [u_{j+1} - 1, u_{j+2} - 1, \ldots, u_n - 1]$ are depth vectors, and hence by induction we know that there are binary search trees $T_L$ and $T_R$ on the universes $\{\omega_1, \omega_2, \ldots, \omega_{j-1}\}$ and $\{\omega_{j+1}, \omega_{j+2}, \ldots, \omega_n\}$ respectively for which $u_L(i) = c_{T_L}(\omega_i)$ and $u_R(i') = c_{T_R}(\omega_{i'})$ for each $1 \le i \le j - 1$ and $j + 1 \le i' \le n$ respectively.

So we can build a binary search tree $T$ on $\Omega$ by labeling the root node $\omega_j$ and setting the left sub-tree to $T_L$ and the right sub-tree to $T_R$. Since the in-order traversals of $T_L$ and of $T_R$ visit $\{\omega_1, \omega_2, \ldots, \omega_{j-1}\}$ and $\{\omega_{j+1}, \omega_{j+2}, \ldots, \omega_n\}$ in sorted order, the in-order traversal of $T$ will visit $\Omega$ in sorted order and hence $T \in X$.

Note also that $c_T(\omega_i) = 1 + c_{T_L}(\omega_i)$ for $1 \le i \le j - 1$ and similarly $c_T(\omega_{i'}) = 1 + c_{T_R}(\omega_{i'})$ for $j + 1 \le i' \le n$. So this implies that $u$ satisfies $u_i = c_T(\omega_i)$ for all $1 \le i \le n$, as desired. This completes the inductive proof that if a vector $u$ satisfies Conditions 1 and 2, then it is a depth vector.

Conversely, given $T \in X$, there is only one element $\omega_j$ such that $c_T(\omega_j) = 1$, namely the label of the root node of $T$, and so Condition 1 is met. Let $T_L$ and $T_R$ be the binary search trees that are the left and right sub-trees of the root of $T$ respectively. Again, $c_T(\omega_i) = 1 + c_{T_L}(\omega_i)$ for $1 \le i \le j - 1$ and similarly $c_T(\omega_{i'}) = 1 + c_{T_R}(\omega_{i'})$ for $j + 1 \le i' \le n$, so the vector corresponding to $c_T$ does indeed satisfy Condition 2 by induction. □
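The recursive characterization in Proposition 6 translates directly into a membership test. The following Python sketch is an illustration only; the function names are hypothetical, and it adopts the convention (an assumption) that the empty vector is the depth vector of the empty universe. The second helper checks the per-depth budget of Claim 1 stated below, which is necessary but not sufficient.

```python
from collections import Counter


def is_depth_vector(u):
    """Recursive test from Proposition 6 on a list of integer depths."""
    if not u:
        return True                     # assumed convention: empty universe, empty tree
    if u.count(1) != 1:
        return False                    # Condition 1: exactly one entry equals 1 (the root)
    j = u.index(1)
    left = [d - 1 for d in u[:j]]       # entries to the left of the root, shifted up one level
    right = [d - 1 for d in u[j + 1:]]  # entries to the right of the root, shifted up one level
    return is_depth_vector(left) and is_depth_vector(right)  # Condition 2


def within_depth_budget(u):
    """Budget restriction only: at most 2**(s - 1) entries equal to s."""
    return all(count <= 2 ** (s - 1) for s, count in Counter(u).items())
```

The vector `[1, 2, 2]` respects the budget but is not a depth vector, since it would require the root to have two right children; this is exactly the kind of combinatorial restriction that the matching relaxation in this section ignores.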

Claim 1. For any depth vector $u$ and any $s \in \{1, 2, \ldots, |\Omega|\}$,

$$|\{j \in [n] : u_j = s\}| \le 2^{s-1}.$$

Lemma 10. The beatability of median search is at least $\frac{2^{r-1} - 1 + 2^{r-3}}{2^r - 1} \approx \frac{5}{8}$.

Proof. Consider the depth vector for median search on $2^r - 1$ items with $r = 3$, namely $[3,2,3,1,3,2,3]$, and consider the partially filled vector $[2, 1, *, *, 2, *, *]$. We can generate the depth vector for median search for $r + 1$ from the depth vector for median search for $r$ as follows: alternately interleave values of $r + 1$ into the depth vector for $r$. For example, the depth vector for median search for $r = 4$ is $[4,3,4,2,4,3,4,1,4,3,4,2,4,3,4]$. We assume by induction that all entries in the partially filled vector are either $*$s or are one less than the corresponding entry in the depth vector for median search. This is true for the base case $r = 3$. We also assume that the $*$s occur in blocks of length exactly two. This is also true in the base case. Then if we consider the depth vector for median search for $r + 1$, if an entry of $r + 1$ is interleaved, we can place a value of $r$ if the corresponding entry in the partially filled vector is interleaved between two entries that are already assigned numbers. Otherwise three entries are interleaved into a string of exactly two $*$s. The median entry in this string of 5 symbols corresponds to a newly added $r + 1$ entry in the depth vector for median search. At the median of this 5-symbol string, we can place a value of $r$. This again creates sequences of $*$s of length exactly two, because we have replaced only the median entry in the string of 5 symbols.

Suppose we are given a partially filled depth vector with the property that one value 1 is placed, two values of 2 are placed, four values of 3 are placed, and so on up to $2^{r-1}$ values of $r$, and that all unfilled entries (which are given the value $*$ for now) occur in blocks of length exactly 2. Then we can fill these symbols with the values $r + 1$ and $r + 2$, such that the value of $r + 1$ aligns with a corresponding value of $r + 1$ in the depth vector for median search (precisely because any two consecutive symbols contain exactly one value of $r + 1$ in the depth vector corresponding to median search for $r + 1$).

We can use Proposition 6 to prove that this resulting completely filled vector is indeed a depth vector. How much does this strategy beat median search? There are $2^r - 1$ locations (i.e. every index in which a value of $1, 2, \ldots,$ or $r$ is placed) in which this strategy beats median search, and there are $2^{r-1}$ locations in which this strategy ties median search. Note that this is for $2^{r+1} - 1$ items, and so on $2^r - 1$ items this strategy scores exactly

$$\frac{2^{r-1} - 1 + 2^{r-3}}{2^r - 1} \quad \text{against median search, and} \quad \lim_{r \to \infty} \frac{2^{r-1} - 1 + 2^{r-3}}{2^r - 1} = \frac{5}{8}.$$

□
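The count above can be checked on small instances. The Python sketch below is illustrative only: `median_depths` follows the interleaving rule from the proof, `score_against` awards 1 for a win and 1/2 for a tie, and the two adversary vectors are assumptions worked out by hand from the construction (the $r = 3$ vector completes $[2, 1, *, *, 2, *, *]$; the $r = 4$ vector is one completion obtained by applying the inductive step once). That both are valid depth vectors can be confirmed with Proposition 6.

```python
from fractions import Fraction


def median_depths(r):
    """Depth vector of median search on 2**r - 1 items, built by interleaving."""
    depths = []
    for level in range(1, r + 1):
        interleaved = [level]
        for d in depths:
            interleaved += [d, level]   # put the new deepest level around every old entry
        depths = interleaved
    return depths


def score_against(u, target):
    """Number of items on which u beats target, counting ties as 1/2."""
    total = Fraction(0)
    for mine, theirs in zip(u, target):
        if mine < theirs:
            total += 1
        elif mine == theirs:
            total += Fraction(1, 2)
    return total


def lemma_10_count(r):
    """The quantity 2^(r-1) - 1 + 2^(r-3) from the proof."""
    return Fraction(2 ** (r - 1) - 1) + Fraction(2 ** r, 8)


# Hand-derived adversary depth vectors (assumptions of this sketch).
ADVERSARY_R3 = [2, 1, 3, 4, 2, 4, 3]
ADVERSARY_R4 = [3, 2, 3, 1, 4, 5, 3, 5, 4, 2, 4, 5, 3, 5, 4]
```

Dividing `score_against(ADVERSARY_R4, median_depths(4))` by the 15 items gives the fraction of searches on which this adversary comes out ahead, with ties split evenly.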



Lemma 11. The beatability of median search is at most $\frac{2^{r-1} - 1 + 2^{r-3}}{2^r - 1} \approx \frac{5}{8}$.



Proof. One can give an upper bound on the beatability of median search by relaxing the question to a matching problem. Given a universe $\Omega$ of size $2^r - 1$, consider the following weighted matching problem: for every value of $s \in \{1, 2, \ldots, r\}$, add $2^{s-1}$ nodes on both the left and right side with label "$s$". For any pair of nodes $a, b$ where $a$ is on the left side and $b$ is on the right side, set the value of the edge connecting $a$ and $b$ to be 0 if the label of $a$ is strictly smaller than the label of $b$, $\frac{1}{2}$ if the two labels have the same value, and 1 if the label of $a$ is strictly larger than the label of $b$.

Let $M$ be the maximum value of a perfect matching. Let $\bar{M}$ be the average value, i.e. $\bar{M} = \frac{M}{2^r - 1}$.

Claim 2. $\bar{M}$ is an upper bound on the beatability of median search.

Proof. For any $s \in \{1, 2, \ldots, r\}$, the depth vector $u(M)$ corresponding to median search has exactly $2^{s-1}$ indices $j$ for which $u(M)_j = s$.

We can make an adversary more powerful by allowing the adversary to choose any vector $u$ which satisfies the condition that for any $s \in \{1, 2, \ldots, |\Omega|\}$, the number of indices $j$ for which $u_j = s$ is at most $2^{s-1}$, because by Claim 1 this is a weaker restriction than requiring the adversary to choose a vector $u$ that is a depth vector. So in this case, the adversary may as well choose a vector $u$ that satisfies the constraint in Claim 1 with equality.

In this case, where we allow the adversary to choose any vector $u$ that satisfies Claim 1, the best response question is exactly the matching problem described above, because for each entry of $u$ the adversary only needs to choose what label $s \in \{1, 2, \ldots, r\}$ to place at this location, subject to the above budget constraint that at most $2^{s-1}$ labels of type "$s$" are used in total. □

Claim 3. $M \le 2^{r-1} - 1 + 2^{r-3}$.

Proof. Given a maximum value bipartite matching problem, the dual covering problem has a variable $y_v$ corresponding to each node $v$, and the goal is to minimize $\sum_v y_v$ subject to the constraint that for every edge $(u, v)$ in the graph (which has value $w(u, v)$), the dual variables satisfy $y_u + y_v \ge w(u, v)$, and each variable $y_v$ is non-negative.

So we can upper bound $M$ by giving a valid dual solution. This will then yield an upper bound on $M$ and consequently will also give an upper bound on $\bar{M}$.

Consider the following dual solution. For each node $v$ on the right with label "$s$" for $s \le r - 2$, set $y_v$ equal to 1. For a node on the right with label $s = r - 1$, set $y_v$ equal to $\frac{1}{2}$, and for a node on the right with label $s = r$, set $y_v = 0$. Additionally, on the left only the nodes with label $r$ are given a non-zero dual variable, and this variable is set equal to $\frac{1}{2}$. The value of the dual $\sum_v y_v$ is

$$1 + 2 + \cdots + 2^{r-3} + \tfrac{1}{2} 2^{r-2} + \tfrac{1}{2} 2^{r-1} = 2^{r-1} - 1 + 2^{r-3}.$$

And so this yields an upper bound on $\bar{M}$ of $\frac{2^{r-1} - 1 + 2^{r-3}}{2^r - 1}$, and

$$\lim_{r \to \infty} \frac{2^{r-1} - 1 + 2^{r-3}}{2^r - 1} = \frac{5}{8}.$$

□
□
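Both halves of the argument can be sanity-checked numerically for small $r$. The Python sketch below is an illustration under stated assumptions: labels are taken to be the depths $1, \ldots, r$ with $2^{s-1}$ nodes of label $s$ on each side, the dual solution is encoded per label, and the exact best response over genuine binary search trees is found by a brute-force interval recursion that is not part of the original proof. All function names are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache

HALF = Fraction(1, 2)


def edge_weight(a, b):
    """Value of pairing median-search depth a (left) with adversary depth b (right)."""
    if a > b:
        return Fraction(1)
    return HALF if a == b else Fraction(0)


def dual_solution(r):
    """Per-label dual variables (left, right) of the covering solution."""
    left = {s: (HALF if s == r else Fraction(0)) for s in range(1, r + 1)}
    right = {s: (Fraction(1) if s <= r - 2 else HALF if s == r - 1 else Fraction(0))
             for s in range(1, r + 1)}
    return left, right


def dual_is_feasible(r):
    left, right = dual_solution(r)
    return all(left[a] + right[b] >= edge_weight(a, b) for a in left for b in right)


def dual_value(r):
    left, right = dual_solution(r)
    return sum(2 ** (s - 1) * (left[s] + right[s]) for s in range(1, r + 1))


def median_depths(r, depth=1):
    """Depth vector of median search on 2**r - 1 items (recursive form)."""
    if r == 0:
        return []
    side = median_depths(r - 1, depth + 1)
    return side + [depth] + side


def best_response_value(r):
    """Best score of any binary search tree against median search (brute force)."""
    target = median_depths(r)

    @lru_cache(maxsize=None)
    def best(lo, hi, depth):
        if lo > hi:
            return Fraction(0)
        return max(edge_weight(target[k], depth)
                   + best(lo, k - 1, depth + 1) + best(k + 1, hi, depth + 1)
                   for k in range(lo, hi + 1))

    return best(0, len(target) - 1, 1)
```

For $r = 2, 3, 4$ the brute-force optimum coincides with the dual value, which is consistent with the relaxation losing nothing on these instances.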
6 Conclusions and Future Directions



The dueling framework presents a fresh way of looking at classic optimization problems through 
the lens of competition. As we have demonstrated, standard algorithms for many optimization 
problems do not, in general, perform well in these competitive settings. This leads us to suspect 
that alternative algorithms, tailored to competition, may find use in practice. We have adapted 
linear programming and learning techniques into methods for constructing such algorithms. 

We have only just begun an exploration of the dueling framework for algorithm analysis; 
there are many open questions yet to consider. For instance, one avenue of future work is to 
compare the computational difficulty of solving an optimization problem with that of solving the 
associated duel. We know that one is not consistently more difficult than the other: in Appendix
B we provide an example in which the optimization problem is computationally easy but the
competitive variant appears difficult; an example of the opposite situation is given in Appendix
C, where a computationally hard optimization problem has a duel which can be solved easily.
Is there some structure underlying the relationship between the computational hardness of an 
optimization problem and its competitive analog? 

Perhaps more importantly, one could ask about performance loss inherent when players 
choose their algorithms competitively instead of using the (single-player) optimal algorithm. 
In other words, what is the price of anarchy [12] of a given duel? Such a question requires
a suitable definition of the social welfare for multiple algorithms, and in particular it may be 
that two competing algorithms perform better than a single optimal algorithm. Our main open 
question is: does competition between algorithms improve or degrade expected performance? 
A Proofs from Section 2



Here we present the proof of Lemma 2. The proof follows a reduction from low-regret learning to computing approximate minmax strategies [5]. It was shown there that if two players use "low regret" algorithms, then the empirical distribution over play will converge to the set of minmax strategies. However, instead of using the weighted majority algorithm, we use the "Follow the expected leader" (FEL) algorithm [11]. That algorithm gives a reduction between the ability to compute best responses and "low regret."

Note that for this section we will use the fact that $x^T M x' \in [-C, C]$ for $C = B^2 n n'$ under our assumptions on $K$, $K'$, and $M$. We will extend the domain of $v : \mathbb{R}^{n}_{\ge 0} \times \mathbb{R}^{n'}_{\ge 0} \to \mathbb{R}$ naturally by $v(x, x') = x^T M x'$. For $x \in [0, B]^n$ and $x' \in [0, B]^{n'}$, $v(x, x') \in [-C, C]$. Additionally, for simplicity we will change the domains of $O$ and $O'$ to $\mathbb{R}^{n'}_{\ge 0}$ and $\mathbb{R}^{n}_{\ge 0}$, as follows. For any $x' \in \mathbb{R}^{n'}_{\ge 0}$, we simply take $O(Bx'/\|x'\|_\infty)$ as the best response to $x'$ (for $x' = 0$, an arbitrary element of $K$, such as $O(0)$, may be chosen). This scaling is logical since $\arg\max_{x \in K} x^T M x' = \arg\max_{x \in K} x^T M \alpha x'$ for $\alpha > 0$. By linearity of $v$, it implies that, for the new oracle $O$ and any $x' \in \mathbb{R}^{n'}_{\ge 0}$,

$$v(O(x'), x') \ge \max_{x \in K} v(x, x') - \epsilon \frac{\|x'\|_\infty}{B}. \qquad (3)$$

Similarly for $O'$.

Fix any sequence length $T \ge 1$. Consider $T$ periods of repeated play of the duel. Let the strategies chosen by players 1 and 2 in period $t$ be $x_t$ and $x'_t$, respectively. Define the regret of player 1 on the sequence to be

$$\max_{x \in K} \sum_{t=1}^{T} v(x, x'_t) - \sum_{t=1}^{T} v(x_t, x'_t).$$

Similarly define regret for player 2. The (possibly negative) regret of a player is how much better that player could have done using the best single strategy, where the best is chosen with the benefit of hindsight.

Observation 1. Suppose that in sequences $x_1, x_2, \ldots, x_T$ and $x'_1, x'_2, \ldots, x'_T$, both players have at most $r$ regret. Let $\sigma = (x_1 + \ldots + x_T)/T$ and $\sigma' = (x'_1 + \ldots + x'_T)/T$ be the uniform mixed strategies over $x_1, \ldots, x_T$ and $x'_1, \ldots, x'_T$, respectively. Then $\sigma$ and $\sigma'$ are $\epsilon$-minmax strategies, for $\epsilon = 2r/T$.

Proof. Say the minmax value of the game is $\alpha$. Let $a = \frac{1}{T} \sum_{t} v(x_t, x'_t)$. Then, by the definition of regret, $a \ge \alpha - r/T$, because otherwise player 1 would have more than $r$ regret, as seen by any minmax strategy for player 1, which guarantees at least an $\alpha T$ payoff on the sequence. Also, against the uniform mixed strategy $\sigma$ over $x_1, \ldots, x_T$, no strategy of player 2 can hold player 1's payoff below $a - r/T$, by the definition of regret (for player 2). Hence, $\sigma$ guarantees player 1 a payoff of at least $\alpha - 2r/T$. A similar argument shows that $\sigma'$ is $2r/T$-minmax for player 2. □

The FEL algorithm for a player is simple. It has parameters $B, R > 0$, $N \ge 1$ and also takes as input an $\epsilon$ best response oracle for the player. For player 1 with best response oracle $O$, the algorithm operates as follows. On each period $t = 1, 2, \ldots$, it chooses $N$ independent uniformly random vectors $r_{t1}, r_{t2}, \ldots, r_{tN} \in [0, R]^{m'}$. It plays

$$x_t = \frac{1}{N} \sum_{j=1}^{N} O\big(r_{tj} + x'_1 + \ldots + x'_{t-1}\big) \in K.$$

The above is seen to be in $K$ by convexity. Also recall that for ease of analysis, we have assumed that $O$ takes as input any positive combination of points in $K'$.
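A single FEL period can be sketched as follows. This Python fragment is a minimal illustration and not the paper's implementation: it assumes a small matrix game in which $K$ and $K'$ are probability simplices, uses an exact (that is, $\epsilon = 0$) best response oracle that returns a pure strategy, and the names `best_response` and `fel_strategy` are hypothetical.

```python
import random


def best_response(M, weights):
    """Exact oracle for the row player: a vertex of the simplex maximizing x^T M w."""
    scores = [sum(m * w for m, w in zip(row, weights)) for row in M]
    best = max(range(len(M)), key=lambda i: scores[i])
    return [1.0 if i == best else 0.0 for i in range(len(M))]


def fel_strategy(M, opponent_history, R, N, rng):
    """One FEL period: average N best responses to perturbed cumulative opponent play."""
    cols = len(M[0])
    # x'_1 + ... + x'_{t-1}; the empty sum is the zero vector.
    cumulative = [sum(x[j] for x in opponent_history) for j in range(cols)]
    mixed = [0.0] * len(M)
    for _ in range(N):
        # r_{tj} is drawn uniformly from the cube [0, R]^{m'}.
        perturbed = [c + rng.uniform(0.0, R) for c in cumulative]
        response = best_response(M, perturbed)
        mixed = [a + b / N for a, b in zip(mixed, response)]
    return mixed
```

Because each oracle call returns a point of $K$ and the output is their average, the result stays in $K$ by convexity, as noted above. A fixed seed such as `random.Random(0)` makes a run reproducible.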
Lemma 12. For any $B, C, R, T, \beta, \epsilon > 0$, and any $r \in [0, R]^{m'}$,

$$\sum_{t=1}^{T} v\big(O(r + x'_1 + x'_2 + \ldots + x'_t), x'_t\big) \ge \max_{x \in K} \sum_{t=1}^{T} v(x, x'_t) - 2CR/B - T(T + R/B)\epsilon.$$

The proof is a straightforward modification of Kalai and Vempala's proof [11]. What this is
saying is that the "be the leader" algorithm, which is "one step ahead" and uses the information 
for the current period in choosing the current period's play, has low regret. Moreover, one 
can perturb the payoffs by any amount in a bounded cube, and this won't affect the bounds 
significantly. The point of the perturbations, which we will choose randomly, will be to make 
it harder to predict what the algorithm will do. For the analysis, they will make it so that "be 
the leader" and "follow the leader" perform similarly. 

Proof. Define $y_t = r + x'_1 + \ldots + x'_{t-1}$. We first show

$$v(O(y_1), r) + \sum_{t=1}^{T} v(O(y_{t+1}), x'_t) \ge v(O(y_{T+1}), r) + \sum_{t=1}^{T} v(O(y_{T+1}), x'_t) - T(T + R/B)\epsilon. \qquad (4)$$

The fact that $\|r\|_\infty \le R$ implies that $v(x, r) \in [-CR/B, CR/B]$, and hence

$$CR/B + \sum_{t=1}^{T} v(O(y_{t+1}), x'_t) \ge \max_{x \in K} \Big( v(x, r) + \sum_{t=1}^{T} v(x, x'_t) \Big) - T(T + R/B)\epsilon \ge \max_{x \in K} \sum_{t=1}^{T} v(x, x'_t) - T(T + R/B)\epsilon - CR/B,$$

which is equivalent to the lemma. We now prove (4) by induction on $T$. For $T = 0$, we have equality. For the induction step, it suffices to show that

$$v(O(y_T), r) + \sum_{t=1}^{T-1} v(O(y_T), x'_t) \ge v(O(y_{T+1}), r) + \sum_{t=1}^{T-1} v(O(y_{T+1}), x'_t) - (R/B + T)\epsilon.$$

However, this is just an inequality between $v(O(y_T), y_T)$ and $v(O(y_{T+1}), y_T)$, and hence follows from (3) and the fact that $\|y_T\|_\infty / B \le R/B + T$. Hence we have established (4) and also the lemma. □

Lemma 13. For any $\delta \ge 0$, with probability $\ge 1 - 2Te^{-2\delta^2 N}$,

$$\sum_{t=1}^{T} v(x_t, x'_t) \ge \max_{x \in K} \sum_{t=1}^{T} v(x, x'_t) - \delta CT - 2BCm'T/R - 2CR/B - T(T + R/B)\epsilon.$$

Proof. It is clear that $y_t$ and $y_{t+1}$ are similarly distributed. For any fixed $x'_1, x'_2, \ldots, x'_T$, define $\bar{x}_t$ by

$$\bar{x}_t = \frac{1}{R^{m'}} \int_{r \in [0, R]^{m'}} O(r + x'_1 + \ldots + x'_{t-1}) \, dr.$$

By linearity of expectation and of $v$, it is easy to see that $E[x_t \mid x'_1, \ldots, x'_{t-1}] = \bar{x}_t$ and

$$E[v(x_t, x'_t) \mid x'_1, \ldots, x'_t] = v(\bar{x}_t, x'_t).$$

By Chernoff-Hoeffding bounds, since $v(x_t, x'_t) \in [-C, C]$, for any $\delta \ge 0$ we have that

$$\Pr\big[\, |v(x_t, x'_t) - v(\bar{x}_t, x'_t)| \ge \delta C \;\big|\; x'_1, \ldots, x'_t \,\big] \le 2e^{-2\delta^2 N}.$$

Hence, by the union bound, $\Pr\big[\, |\sum_t \big(v(x_t, x'_t) - v(\bar{x}_t, x'_t)\big)| \ge \delta CT \,\big] \le 2Te^{-2\delta^2 N}$.

The key observation of Kalai and Vempala is that $\bar{x}_t$ and $\bar{x}_{t+1}$ are close because the $m'$-dimensional cubes $x'_1 + \ldots + x'_{t-1} + [0, R]^{m'}$ and $x'_1 + \ldots + x'_t + [0, R]^{m'}$ overlap significantly. In particular, they overlap on all but at most a $Bm'/R$ fraction [11] of their volume. Since $v$ is in $[-C, C]$, this means that $|v(\bar{x}_t, x'_t) - v(\bar{x}_{t+1}, x'_t)| \le 2BCm'/R$. This follows from the fact that $v$ is bilinear, and hence when moved into the integral has exactly the same behavior on all but a $Bm'/R$ fraction of the points in each cube. This implies that with probability $\ge 1 - 2Te^{-2\delta^2 N}$,

$$\sum_{t=1}^{T} v(x_t, x'_t) \ge \sum_{t=1}^{T} v(\bar{x}_{t+1}, x'_t) - \delta CT - 2BCm'T/R.$$

Combining this with Lemma 12 completes the proof. □

We are now ready to prove Lemma 2.

Proof of Lemma\M We take T = (^4C^max(m, m')/(3e)) R = B,/max(m, m')T and N = 

ln(4rC/J)/(2e^). As long as T > max(m, m'), R/B < T and hence Lemma [T^ implies that 
with probability at least 1 — 5, if both players play FEL then both will have regret at most 



eT + 4C-\/ max(m, m' max(m, m' 

Observation 1 completes the proof. □



B A Racing Duel 

The racing duel illustrates a simple example in which the beatability is unbounded, the optimization problem is "easy," but finding polynomial-time minmax algorithms remains a challenging open problem. The optimization problem behind the racing duel is routing under uncertainty. There is an underlying directed multigraph $(V, E)$ containing designated start and terminal nodes $s, t \in V$, along with a distribution over bounded weight vectors $\Omega \subset \mathbb{R}^{E}_{\ge 0}$, where $\omega_e$ represents the delay in traversing edge $e$. The feasible set $X$ is the set of paths from $s$ to $t$. The probability distribution $p \in \Delta(\Omega)$ is an arbitrary measure over $\Omega$. Finally, $c(x, \omega) = \sum_{e \in x} \omega_e$.

For general graphs, solving the racing duel seems quite challenging. This is true even when routing between two nodes with parallel edges, i.e., $V = \{s, t\}$ and all edges $E = \{e_1, e_2, \ldots, e_n\}$ are from $s$ to $t$. As mentioned in the introduction, this problem is in some sense a "primal" duel, in that it can encode any duel with a finite strategy set. In particular, given any optimization problem with $|X| = n$, we can create a race where each edge $e_i \in E$ corresponds to a strategy $x_i \in X$, and the delays on the edges match the costs of the associated strategies.

B.1 Shortest path routing is 1-beatable

The single-player racing problem is easy: take the shortest path on the graph with weights $w_e = E_{\omega \sim p}[\omega_e]$. However, this algorithm can be beaten almost always. Consider a graph with two parallel edges, $a$ and $b$, both from $s$ to $t$. Say the cost of $a$ is $\epsilon/2 > 0$ with probability 1, and the cost of $b$ is 0 with probability $1 - \epsilon$ and 1 with probability $\epsilon$. The optimization algorithm will choose $a$, but $b$ beats $a$ with probability $1 - \epsilon$, which is arbitrarily close to 1.
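The arithmetic of this example can be verified exactly. The short Python sketch below is illustrative; the function name `racing_example` is hypothetical, and it simply evaluates the two expected delays and the probability that edge $b$ is strictly faster than edge $a$.

```python
from fractions import Fraction


def racing_example(eps):
    """Two parallel s-t edges: returns (E[cost of a], E[cost of b], Pr[b beats a])."""
    cost_a = eps / 2                                          # edge a: deterministic delay eps/2
    outcomes_b = [(1 - eps, Fraction(0)), (eps, Fraction(1))]  # edge b: (probability, delay)
    expected_b = sum(p * c for p, c in outcomes_b)
    b_beats_a = sum(p for p, c in outcomes_b if c < cost_a)
    return cost_a, expected_b, b_beats_a
```

With `eps = Fraction(1, 100)`, edge $a$ has the smaller expected delay, so the shortest-path algorithm selects it, yet $b$ wins the race 99 times out of 100.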

B.2 Price of anarchy 

Take social welfare to be the average performance, $W(x, x') = (c(x) + c(x'))/2$. Then the price of anarchy for racing is unbounded. Consider a graph with two parallel edges, $a$ and $b$, both from $s$ to $t$. The cost of $a$ is $\epsilon > 0$ with probability 1, and the cost of $b$ is 0 with probability 3/4 and 1 with probability 1/4. Then $b$ is a dominant strategy for both players, but its expected cost is 1/4, so the price of anarchy is $1/(4\epsilon)$, which can be arbitrarily large.
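The dominance claim and the resulting ratio can be checked with exact arithmetic. The Python sketch below is illustrative only: it assumes $0 < \epsilon < 1$, that both players face the same realized delays, and that a tie is worth 1/2 to each player. The function names are hypothetical.

```python
from fractions import Fraction


def duel_payoffs(eps):
    """Win probabilities and expected costs for the two-edge example."""
    # Joint outcomes: (probability, delay of edge a, delay of edge b).
    outcomes = [(Fraction(3, 4), eps, Fraction(0)),
                (Fraction(1, 4), eps, Fraction(1))]
    column = {"a": 1, "b": 2}

    def payoff(mine, theirs):
        total = Fraction(0)
        for row in outcomes:
            p, c_mine, c_theirs = row[0], row[column[mine]], row[column[theirs]]
            if c_mine < c_theirs:
                total += p              # strictly faster: win
            elif c_mine == c_theirs:
                total += p / 2          # same delay: tie
        return total

    expected = {e: sum(row[0] * row[column[e]] for row in outcomes) for e in column}
    return payoff, expected


def price_of_anarchy(eps):
    payoff, expected = duel_payoffs(eps)
    # b is dominant: strictly better than a against either choice of the opponent.
    assert payoff("b", "a") > payoff("a", "a") and payoff("b", "b") > payoff("a", "b")
    return expected["b"] / expected["a"]   # equilibrium cost over optimal cost
```

For `eps = Fraction(1, 100)` the ratio is 25, matching $1/(4\epsilon)$.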
C When Competing is Easier than Playing Alone



Recall that the racing problem from Appendix B was "easy" for single-player optimization, yet seemingly difficult to solve in the competitive setting. We now give a contrasting example: a problem for which competing is easier than solving the single-player optimization.

The intuition behind our construction is as follows. The optimization problem will be based 
upon a computationally difficult decision problem, which an algorithm must attempt to answer. 
After the algorithm submits an answer, nature provides its own "answer" chosen uniformly at 
random. If the algorithm disagrees with nature, it incurs a large cost that is independent of 
whether or not it was correct. If the algorithm and nature agree, then the cost of answering the 
problem correctly is less than the cost of answering incorrectly. 

More formally, let $L \subseteq \{0, 1\}^*$ be an arbitrary language, and let $z \in \{0, 1\}^*$ be a string. Our duel will be $D(X, \Omega, p, c)$ where $X = \Omega = \{0, 1\}$, $p$ is uniform, and the cost function is

$$c(x, \omega) = \begin{cases} 0 & \text{if } (x = \omega = 1 \text{ and } z \in L) \text{ or } (x = \omega = 0 \text{ and } z \notin L), \\ 1 & \text{if } (x = \omega = 1 \text{ and } z \notin L) \text{ or } (x = \omega = 0 \text{ and } z \in L), \\ 2 & \text{if } x \ne \omega. \end{cases}$$

The unique optimal solution to this (single-player) problem is to output 1 if and only if $z \in L$. Doing so is as computationally difficult as the decision problem itself. On the other hand, finding a minmax optimal algorithm is trivial for every $z$ and $L$, since every algorithm has value 1/2: for any $x'$, $v(1 - x', x') = \Pr[\omega \ne x'] = 1/2 = v(x', x')$.

D Asymmetric Games

We note that all of the examples we considered have been symmetric with respect to the players, but our results can be extended to asymmetric games. Our analysis of bilinear duels in Section 2.1 does not assume symmetry when discussing bilinear games. For instance, we could consider a game where player 1 wins in the case of ties, so player 1's payoff is $\Pr[c(x, \omega) \le c(x', \omega)]$. One natural example would be a ranking duel in which there is an "incumbent" search engine that appeared first, so a user prefers to continue using it rather than switching to a new one. This game can be represented in the same bilinear form as in Section 2.5, the only change being a small modification of the payoff matrix $M$. Other types of asymmetry, such as players having different objective functions, can be handled in the same way. For example, in a hiring duel, our analysis techniques apply even if the two players have different pools of candidates, of possibly different sizes and qualities.